Re: [whatwg] Need clarification on DOM exceptions thrown by canvas 2D drawImage
On Mon, Aug 8, 2011 at 10:08 PM, Jeff Muizelaar jmuizel...@mozilla.com wrote: On 2011-08-08, at 4:58 PM, Ian Hickson wrote: On Mon, 8 Aug 2011, Justin Novosad wrote: This inquiry is regarding this page of the specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html In section 4.8.11.1.10 Images, about drawImage(), it is stated that If one of the sw or sh arguments is zero, the implementation must raise an INDEX_SIZE_ERR exception There are no other references to other circumstances under which INDEX_SIZE_ERR should be thrown, and there is no indication of what the correct behavior is when the source rectangle is completely or partially outside the bounds of the source image. The spec used to throw exceptions on out-of-bounds source rectangles, but that causes breakage because floats are imprecise (e.g. http://www.jigzone.com/xmockup/oCanvasBug.php failed in Opera because 79.01 79 as 64-bit double, whereas other browsers presumably rounded to 32-bit float first), so it had to be changed. (http://html5.org/r/5373 first, then changed again because of http://www.w3.org/Bugs/Public/show_bug.cgi?id=10799 to be consistent with filtering behaviour.) A bit lower down in the same section, the spec says: When the filtering algorithm requires a pixel value from outside the original image data, it must instead use the value from the nearest edge pixel. (That is, the filter uses 'clamp-to-edge' behavior.) The clamp-to-edge behavior doesn't really work well with Coregraphics' drawImage call. This means that this behaviour is not implemented in Firefox on OS X and I expect WebKit doesn't implement it for a similar reason. I was actually hoping the spec could be changed to the simpler behaviour of just clamping the source rectangle to the bounds of the image. This behaviour is easy to implement on all platforms and is still quite reasonable. Does the clamp-to-edge behaviour work fine when the source rectangle is entirely inside the image? 
e.g. the image

  8800
  8800
  0088
  0088

(where each digit is a pixel) drawn at 2x scale with bilinear filtering should give

  88862000
  88862000
  88862000
  66653222
  22235666
  00026888
  00026888
  00026888

because of the filtering requirements. If CoreGraphics can't do that then it's broken (per the spec) regardless of how source rectangles are handled. Or is it able to do clamp-to-edge fine up to the edge of the source image, just not extend that beyond the image when the source rectangle is expanded further?
-- Philip Taylor exc...@gmail.com
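The 2x example in the message above can be checked with a small sketch. It assumes pixel centres sit at half-integer coordinates and that any sample falling outside the image is clamped to the nearest edge pixel; the function names are illustrative only and do not come from any browser or from CoreGraphics.

```python
import math


def upscale_bilinear_clamped(image, scale):
    """Upscale a 2D grid of numbers with bilinear filtering and clamp-to-edge.

    Assumption: destination pixel centres map back to source coordinates as
    (d + 0.5) / scale - 0.5, i.e. pixel centres are at half-integers.
    """
    height, width = len(image), len(image[0])

    def sample(y, x):
        # Clamp-to-edge: out-of-range indices reuse the nearest edge pixel.
        y = max(0, min(height - 1, y))
        x = max(0, min(width - 1, x))
        return image[y][x]

    out = []
    for dy in range(height * scale):
        sy = (dy + 0.5) / scale - 0.5
        y0 = math.floor(sy)
        fy = sy - y0
        row = []
        for dx in range(width * scale):
            sx = (dx + 0.5) / scale - 0.5
            x0 = math.floor(sx)
            fx = sx - x0
            top = sample(y0, x0) * (1 - fx) + sample(y0, x0 + 1) * fx
            bottom = sample(y0 + 1, x0) * (1 - fx) + sample(y0 + 1, x0 + 1) * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def render(grid):
    """Format each row as a string of digits, as in the message."""
    return ["".join(str(round(v)) for v in row) for row in grid]


source = [
    [8, 8, 0, 0],
    [8, 8, 0, 0],
    [0, 0, 8, 8],
    [0, 0, 8, 8],
]

for line in render(upscale_bilinear_clamped(source, 2)):
    print(line)
```

Under those assumptions the printed rows reproduce the grid quoted in the message. The outermost rows and columns are the ones that need clamp-to-edge, since their sample positions fall a quarter pixel outside the source image.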
Re: [whatwg] Accept full CSS colors in the legacy color parsing algorithm
On Fri, Apr 8, 2011 at 10:26 PM, Boris Zbarsky bzbar...@mit.edu wrote: On 4/8/11 1:54 PM, Tab Atkins Jr. wrote: In the legacy color parsing algorithm [...] Could we change those two steps to just say If keyword is a valid CSS color value, then return the simple color corresponding to that value.? (I guess, to fully match Webkit, you need to change the definition of simple color to take alpha into account.) Do you have web compat data here? I don't know if this is relevant or useful but anyway: http://philip.html5.org/data/font-colors.txt has some basic data for font color values, http://philip.html5.org/data/bgcolors.txt for body bgcolor. (Each line is the number of URLs that value was found on (from the set from http://philip.html5.org/data/dotbot-20090424.txt), followed by the XML-encoded value.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas gradients color interpolation - change to premultiplied?
On Tue, Nov 23, 2010 at 8:43 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: Right now, canvas gradients interpolate their colors in non-premultiplied space; that is, the raw values of r, g, b, and a are interpolated independently. This has the unfortunate effect that colors darken as they transition to transparent, as transparent is defined as rgba(0,0,0,0), a transparent black. Under this scheme, the color halfway between yellow and transparent is rgba(127,127,0,.5), a partially-transparent dark yellow, rather than rgba(255,255,0,.5).* If you define the gradient as interpolating from solid yellow to transparent black, I'd expect that it *should* be semi-transparent blackish-yellow in the middle. If you want it to be pure yellow, don't use a keyword which is explicitly specified as transparent black - define the gradient from rgba(255,255,0,1) to rgba(255,255,0,0) instead. Then you'll get rgba(255,255,0,0.5) in the middle. The rest of the platform has switched to using premultiplied colors for interpolation, because they react better in cases like this**. CSS transitions and CSS gradients now explicitly use premultiplied colors, and SVG ends up interpolating similarly (they don't quite have the same problem - they track opacity separate from color, so transitioning from color:yellow;opacity:1 to color:yellow;opacity:0 gives you color:yellow;opacity:.5 in the middle, which is the moral equivalent of rgba(255,255,0,.5)). That sounds like SVG gradients *can't* be using premultiplied colours. A transition from color:yellow;opacity:1 to color:black;opacity:0 will have rgba(127,127,0,0.5) in the middle, and it's impossible to get that if you are using premultiplied colours. You'd have to have A=1 at the start and A=0 at the end, so (with premultiplied colour) the end would be interpreted as rgba(0,0,0,0), so you'd get the same as interpolating to color:yellow;opacity:0 (i.e. rgba(255,255,0,0.5) in the middle), which is not what SVG does. 
http://www.w3.org/TR/SVGTiny12/painting.html#Gradients says explicitly its behaviour is the non-premultiplied behaviour we currently get with canvas. (gradient from fully transparent red, via partly transparent dark yellow, to fully opaque lime - the RGB components of fully transparent colours are preserved.) Maybe CSS should have originally used the keyword transparentblack instead of transparent (though the distinction didn't matter before gradients existed) - changing the gradient algorithm solely to work more intuitively when people happen to use that one particular incorrectly-named keyword seems backwards, and a mistake in CSS. (Perhaps CSS gradients could avoid this problem by overriding the meaning of the transparent keyword, so that instead of rgba(0,0,0,0) it means A=0 with the mean RGB of the adjacent colour stops. That would let it work as people naturally expect when they use that keyword, and they can use the rgba() syntax if they really want transparent black or transparent yellow or transparent red etc.) -- Philip Taylor exc...@gmail.com
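The difference between the two interpolation schemes discussed in this thread can be worked through numerically. This is a minimal sketch, assuming colours are (r, g, b, a) tuples with r, g, b in 0..255 and a in 0..1; the helper names are made up for illustration and are not part of the canvas API.

```python
def lerp(a, b, t):
    return a + (b - a) * t


def mix_non_premultiplied(c0, c1, t):
    """Interpolate r, g, b and a independently (current canvas behaviour)."""
    return tuple(lerp(x, y, t) for x, y in zip(c0, c1))


def premultiply(c):
    r, g, b, a = c
    return (r * a, g * a, b * a, a)


def mix_premultiplied(c0, c1, t):
    """Interpolate in premultiplied space, then convert back."""
    r, g, b, a = mix_non_premultiplied(premultiply(c0), premultiply(c1), t)
    if a == 0:
        return (0.0, 0.0, 0.0, 0.0)
    return (r / a, g / a, b / a, a)


yellow = (255, 255, 0, 1.0)
transparent_black = (0, 0, 0, 0.0)
transparent_yellow = (255, 255, 0, 0.0)

# Non-premultiplied: the midpoint darkens towards black.
print(mix_non_premultiplied(yellow, transparent_black, 0.5))
# Non-premultiplied, but ending on transparent yellow: stays pure yellow.
print(mix_non_premultiplied(yellow, transparent_yellow, 0.5))
# Premultiplied: both end colours give the same midpoint.
print(mix_premultiplied(yellow, transparent_black, 0.5))
print(mix_premultiplied(yellow, transparent_yellow, 0.5))
```

The last two lines illustrate the point made about SVG: once colours are premultiplied, the RGB of a fully transparent stop is lost, so a gradient to transparent black cannot be told apart from a gradient to transparent yellow.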
Re: [whatwg] Question about gradient stops in canvas and parsing as CSS colors
On Wed, Sep 22, 2010 at 3:49 PM, Anne van Kesteren ann...@opera.com wrote: On Wed, 22 Sep 2010 16:47:02 +0200, Boris Zbarsky bzbar...@mit.edu wrote: Clearly I happen to think Gecko's behavior is the sane one here, but there's a clear interoperability problem either way. Certainly Opera and Gecko interpreted the spec differently. Might be the way we invoke the CSS code. I think the Gecko behavior makes sense. Philip, can your test suite cover this? Added with the Gecko behaviour (and added a few other cases - Opera 10.61 fails some like rgba-solid-3): http://dvcs.w3.org/hg/html/diff/5a95d6481bac/tests/submission/PhilipTaylor/tools/canvas/tests2d.yaml Run the tests from http://test.w3.org/html/tests/submission/PhilipTaylor/canvas/index.2d.fillStyle.parse.html or (if that's down) http://dvcs.w3.org/hg/html/raw-file/tip/tests/submission/PhilipTaylor/canvas/index.2d.fillStyle.parse.html -- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas API: What should happen if non-finite floats are used
On Wed, Sep 8, 2010 at 9:02 PM, Boris Zbarsky bzbar...@mit.edu wrote: On 9/8/10 2:22 PM, Oliver Hunt wrote: One old case that failed in the presence of exceptions was the old canvex demo at http://canvex.lazyilluminati.com/83/play.xhtml - this was one of the first cases i saw after trying to make webkit's implementation conform to the (older) spec by throwing exceptions on non-finite values we had many canvas using sites break so had to stop throwing. OK. I can believe that this was the case at the time, but it certainly wasn't due to Firefox not throwing. I can see how given people's penchant to create browser-specific content changing the webkit behavior could cause issues with sites that were targeting only webkit and didn't bother testing in anything else. Canvex was originally written for and tested in Firefox 1.5/2.0 and Opera 9. It wasn't tested in Safari (due to lack of Mac). I think the relevant bug is https://bugs.webkit.org/show_bug.cgi?id=13537 which was actually caused by passing 0 sizes to drawImage, not by non-finite values. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas feedback (various threads)
On Wed, Aug 11, 2010 at 9:35 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 29 Jul 2010, Gregg Tavares (wrk) wrote: source-over glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA); I tried searching the OpenGL specification for either glBlendFunc or GL_ONE_MINUS_SRC_ALPHA and couldn't find either. Could you be more specific regarding what exactly we would be referencing? I'm not really sure I understand you proposal. The OpenGL spec omits the gl/GL_ prefixes - search for BlendFunc instead. (In the GL 3.0 spec, tables 4.1 (the FUNC_ADD row) and 4.2 seem relevant for defining the blend equations.) -- Philip Taylor exc...@gmail.com
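For readers unfamiliar with the blend factors, the quoted BlendFunc(ONE, ONE_MINUS_SRC_ALPHA) pairing can be written out as arithmetic. This is only a sketch of the additive blend equation, assuming premultiplied (r, g, b, a) colours with components in 0..1; the function name is illustrative.

```python
def source_over_premultiplied(src, dst):
    """result = src * ONE + dst * ONE_MINUS_SRC_ALPHA, per component.

    Assumption: both colours are premultiplied by their own alpha.
    """
    src_alpha = src[3]
    return tuple(s * 1.0 + d * (1.0 - src_alpha) for s, d in zip(src, dst))


half_red = (0.5, 0.0, 0.0, 0.5)      # 50% opaque red, premultiplied
opaque_blue = (0.0, 0.0, 1.0, 1.0)

print(source_over_premultiplied(half_red, opaque_blue))
```

The source factor is ONE rather than SRC_ALPHA because a premultiplied source colour has already been scaled by its alpha.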
Re: [whatwg] Default value of complete attribute on new Image objects
On Wed, Aug 11, 2010 at 12:23 AM, Ian Hickson i...@hixie.ch wrote: I've updated the spec to have complete return true if the src is the empty string. Some canvas methods (drawImage, createPattern) are defined in terms of the complete attribute (If the image argument is an HTMLImageElement object whose complete attribute is false, [...] then the implementation must return without drawing anything.). Now that it can be true when the image doesn't have any image data, what should they do when passed such an image? -- Philip Taylor exc...@gmail.com
Re: [whatwg] Default value of complete attribute on new Image objects
On Wed, Aug 11, 2010 at 1:06 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Aug 10, 2010 at 4:56 PM, Philip Taylor excors+wha...@gmail.com wrote: On Wed, Aug 11, 2010 at 12:23 AM, Ian Hickson i...@hixie.ch wrote: I've updated the spec to have complete return true if the src is the empty string. Some canvas methods (drawImage, createPattern) are defined in terms of the complete attribute (If the image argument is an HTMLImageElement object whose complete attribute is false, [...] then the implementation must return without drawing anything.). Now that it can be true when the image doesn't have any image data, what should they do when passed such an image? Isn't the image fully loaded, just empty? Depends how you define the concept of fully loaded, I guess. The spec says an empty src is invalid and triggers an error event and makes the image not available (but now also complete), so it's not entirely the same as a normal non-empty image. Seems like drawing such an image should act normal. It just so happens that normal for an empty image would be to draw nothing? Just have to avoid divide-by-zero errors when creating patterns :) Probably should do the same as a 0-pixel canvas (If the image argument is an HTMLCanvasElement object with either a horizontal dimension or a vertical dimension equal to zero, then the implementation must raise an INVALID_STATE_ERR exception.). (The spec currently assumes complete HTMLImageElements always have non-zero size, so the dimension check isn't applied to them.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] HTML resource packages
On Wed, Aug 4, 2010 at 1:31 AM, Justin Lebar justin.le...@gmail.com wrote: We at Mozilla are hoping to ship HTML resource packages in Firefox 4, and we wanted to get the WhatWG's feedback on the feature. For the impatient, the spec is here: http://people.mozilla.org/~jlebar/respkg/

It seems a bit surprising that [pkg.zip img1.png img2.png] provides more files than [pkg.zip img1.png] but *fewer* files than [pkg.zip] (which includes all files). I can imagine people would write code like:

  print "<html packages='[cached-image-thumbnails.zip " . (join " ", @thumbnails_which_are_not_out_of_date) . "]'>";

(intending the package to be updated infrequently, and used only for images that haven't been modified since the last package update), and they would get completely the wrong behaviour when the list is empty. So maybe [pkg.zip] should mean no files (vs pkg.zip which still means all files).

Filenames in zips are byte-strings, not Unicode-character-strings. What should happen with non-ASCII in the zip's list of contents? People will use standard zip programs and frequently end up with various random character encodings in their file - would browsers guess or decode as CP437 or decode as UTF-8 or fail? would they look at the zip header's language encoding flag? etc.

What happens if the document contains multiple html elements (not all the root element)? (e.g. if it's XHTML, or the elements are added by scripts). The packages spec seems to assume there is only ever one.

The note at the end of 4.1 seems to be about avoiding problems like http://evil.com/ saying:

  <html packages="eviloverride.zip"> <!-- gets downloaded from evil.com -->
  <base href="http://bank.com/">
  <img src="http://bank.com/logo.png"> <!-- this shouldn't be allowed to come from the .zip -->

Why is this particular example an important problem? If the attacker wants to insert their own files into their own pages, they can just do it directly without using packages.
Since this is (I assume) only used for resources like images and scripts and stylesheets, and not for a hrefs or iframe hrefs, I don't see how it would let the attacker circumvent any same-origin restrictions or do anything else dangerous. The opposite way seems more dangerous, where evil.com says:

  <html packages="http://evil.com/redirect.cgi?http://secret-bank-intranet-server/packages.zip">
  <img src="http://evil.com/logo.png">
  <!-- now use canvas to read the pixel data of the secret logo, since it was loaded from the evil.com origin -->

Is anything stopping that?

In 4.3 step 2: What is pkg-url initialised to? (The package href of p?)
-- Philip Taylor exc...@gmail.com
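The empty-list surprise raised in this message can be modelled in a few lines. This is a toy sketch that follows only the behaviour as described in the message (a bracketed value with no file names provides every file in the zip); both helpers are hypothetical and do not implement the actual parsing rules of the resource packages draft.

```python
def packages_attr(zip_name, files):
    """Build a bracketed packages value from a zip name and a file list."""
    return "[" + " ".join([zip_name, *files]) + "]"


def files_provided(attr_value, zip_contents):
    """Return the files the package would provide, per the message's description.

    Assumption: an empty file list means "all files in the zip".
    """
    listed = attr_value.strip("[]").split()[1:]
    if not listed:
        return sorted(zip_contents)
    return sorted(name for name in listed if name in zip_contents)


contents = {"a.png", "b.png", "c.png"}

print(files_provided(packages_attr("pkg.zip", ["a.png", "b.png"]), contents))
print(files_provided(packages_attr("pkg.zip", ["a.png"]), contents))
# A generated list that happens to be empty suddenly provides everything.
print(files_provided(packages_attr("pkg.zip", []), contents))
```

Shrinking the list from two names to one provides fewer files, but shrinking it to zero provides all of them, which is the discontinuity a script generating the list would trip over.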
Re: [whatwg] HTML resource packages
On Wed, Aug 4, 2010 at 9:01 PM, Justin Lebar justin.le...@gmail.com wrote: What happens if the document contains multiple html elements (not all the root element)? (e.g. if it's XHTML, or the elements are added by scripts). The packages spec seems to assume there is only ever one. The packages attribute should work like the manifest attribute currently works. I don't see language in the cache manifest section of HTML5 (6.6) specifying what happens when there are multiple html elements, so I hope I don't need to specify this either. :) http://whatwg.org/html#attr-html-manifest says: The manifest attribute only has an effect during the early stages of document load. Changing the attribute dynamically thus has no effect (and thus, no DOM API is provided for this attribute). Its effect is triggered from http://whatwg.org/html#parser-appcache (html token in the before html insertion mode) or from http://whatwg.org/html#read-xml , so it will only ever run for the root html element of the document. The packages attribute is defined as running Whenever the packages attribute is changed (including when the document is first loaded, if its html element has a packages attribute), so it's not the same. If you do want it to work the same then you'll need to hook into the parser and ignore dynamic updates. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Allowing > in attribute values
On Thu, Jun 24, 2010 at 2:34 PM, Benjamin M. Schwartz bmsch...@fas.harvard.edu wrote: [...] HTML5 is about making a spec that matches common practice, right? In practice, no one puts > in attribute values. The data disagrees: http://philip.html5.org/data/gt-in-attribute.txt -- Philip Taylor exc...@gmail.com
Re: [whatwg] WebSockets: UDP
On Thu, Jun 3, 2010 at 7:28 AM, Erik Möller emol...@opera.com wrote: [...] One thing to remember here is that browsers have other means for communication as well. I'm not saying we shouldn't support reliable messages over UDP, but just pointing out the option. Yep - the relevant use cases ought to be supported decently by the platform, but not necessarily by this extension to the platform (it might be a different extension or it might be (probably is) supported already). - Protection against an attacker initiating a legitimate socket with a user and then redirecting it (with some kind of IP (un)hijacking) to a service behind the user's firewall (which isn't a problem when using TCP since the service will ignore packets when it hasn't done the TCP handshake; but UDP services might respond to a single packet from the middle of a websocket stream, so every single packet will have to be careful not to be misinterpreted dangerously by unsuspecting services). I don't quite follow what you mean here. Can you expand on this with an example? I was thinking something like: A host at IP 11.11.11.11 on the public internet runs some UDP service, like DNS or TFTP or something a bit more secure. That service is restricted so it only responds to packets received from IP 22.22.22.22 (a trusted user). The UDP Web Socket handshake is carefully constructed so that it won't trigger dangerous behaviour in any of those services (like how the TCP Web Socket uses a safe HTTP-ish handshake). An attacker hijacks the IP 11.11.11.11 from the perspective of the user (by advertising new routes near the user), so the user's packets to that address go to the attacker. The attacker gets the user to visit a web page which sets up a UDP Web Socket with the attacker's server at 11.11.11.11, doing all the handshake authentication correctly. The attacker then releases its hijacked address, so any subsequent Web Socket packets will go to the original restricted service. 
Since they're being received from the trusted user, the service will trust them. Since the web browser has already done the Socket handshake, it will believe it's talking to a legitimate Web Socket server and will continue sending whatever data packets the attacker's script tells it to. The service will then be receiving and responding to attacker-controlled packets, and will never have seen the carefully constructed handshake that's designed to protect it. That's not a danger for TCP services since they'll reject unexpected packets from the middle of a TCP stream, but UDP services may accept packets from the middle of a UDP Web Socket stream. So it's not sufficient to carefully construct the Web Socket handshake packets to not trigger unwanted behaviour in non-Socket services. Every data packet sent on the Socket has to be carefully constructed too. (This might be a largely impractical or pointless attack, and there's probably much easier ways to attack the exposed service, but I don't know enough about security to judge that. Also I don't know what packet construction would be sufficiently careful. But it seems like a possible new concern that's introduced when using UDP in this context.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] WebSockets: UDP
On Tue, Jun 1, 2010 at 9:02 PM, Erik Möller emol...@opera.com wrote: On Tue, 01 Jun 2010 21:14:33 +0200, Philip Taylor excors+wha...@gmail.com wrote: More feedback is certainly good, though I think the libraries I mentioned (DirectPlay/OpenTNL/RakNet/ENet (there's probably more)) are useful as an indicator of common real needs (as opposed to edge-case or merely perceived needs) - they've been used by quite a few games and they seem to have largely converged on a core set of features, so that's better than just guessing. I guess many commercial games write their own instead of reusing third-party libraries, and I guess they often reimplement very similar concepts to these, but it would be good to have more reliable information about that. I was hoping to be able to avoid looking at what the interfaces of a high vs low level option would look like this early on in the discussions, but perhaps we need to do just that; look at Torque, RakNet etc and find a least common denominator and see what the reactions would be to such an interface.

I'm trying to think of them mainly as indirect examples of use cases, rather than as direct examples of interfaces. Under the assumption that most games either use a library like these or implement a comparable one themselves, and that the library designs are driven by the game requirements, if a feature is supported by most of the libraries then it's probably needed by many games; and if a feature is unsupported in many of the libraries then it's probably unnecessary for most games. (Also an assumption: games running in web browsers will have similar needs to native games (though lagging many years behind state-of-the-art); and we only ought to aim to support the needs of most games, not all games.) So they seem to suggest things like:

- many games need a combination of reliable and unreliable-ordered and unreliable-unordered messages.
- many games need to send large messages (so the libraries do automatic fragmentation).
- many games need to efficiently send tiny messages (so the libraries do automatic aggregation).
- many games need some kind of security (I have no idea exactly what, or how much is still relevant when the client is JavaScript and trivial to tamper with).
- many games need to prioritise certain messages when bandwidth is limited.
- most games don't need low-level control over individual datagrams and precise packet loss feedback, they're okay with the socket details being abstracted away.
- ... probably lots more (and/or less); I'm not very familiar with the details of the libraries so this is unlikely to be an accurate list, but I think it may be a useful way to analyse the requirements.

(The solution suggested in your initial post (socket.send(data_smaller_than_mtu) going over UDP) seems to be one extreme, which combines with higher-level JS libraries to satisfy these needs. I think I initially suggested the other extreme of encoding all the features into the browser API. I guess the best tradeoff depends largely on what non-game use cases exist that should be satisfied by the same solution.)

So, what would the minimal set of limitations be to make a UDP WebSocket browser-safe?
- No listen sockets
- No multicast
- Reliable handshake with origin info
- Automatic keep-alives
- Reliable close handshake
- Socket is bound to one address for the duration of its lifetime
- Sockets open sequentially (like current DOS protection in WebSockets)
- Cap on number of open sockets per server and total per user agent

Perhaps also:

- Cap or dynamic limit on bandwidth (you don't want a single web page flooding the user's network connection and starving all the TCP connections)
- Protection against session hijacking
- Protection against an attacker initiating a legitimate socket with a user and then redirecting it (with some kind of IP (un)hijacking) to a service behind the user's firewall (which isn't a problem when using TCP since the service will ignore packets when it hasn't done the TCP handshake; but UDP services might respond to a single packet from the middle of a websocket stream, so every single packet will have to be careful not to be misinterpreted dangerously by unsuspecting services).

-- Philip Taylor exc...@gmail.com
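One of the needs listed earlier in this message, automatic aggregation of tiny messages, can be sketched as follows. This is an illustration under stated assumptions, not the scheme used by any of the named libraries: each message gets a 2-byte big-endian length prefix, packing is greedy and order-preserving, and the 16-byte datagram limit is an arbitrary small value chosen to make the example readable.

```python
import struct


def aggregate(messages, max_datagram):
    """Pack length-prefixed messages into as few datagrams as a greedy pass allows."""
    datagrams = []
    current = b""
    for msg in messages:
        framed = struct.pack("!H", len(msg)) + msg
        if len(framed) > max_datagram:
            # Oversized messages would need fragmentation, which is not shown here.
            raise ValueError("message too large for one datagram")
        if len(current) + len(framed) > max_datagram:
            datagrams.append(current)
            current = b""
        current += framed
    if current:
        datagrams.append(current)
    return datagrams


def split(datagram):
    """Recover the individual messages from one aggregated datagram."""
    messages = []
    pos = 0
    while pos < len(datagram):
        (length,) = struct.unpack_from("!H", datagram, pos)
        pos += 2
        messages.append(datagram[pos:pos + length])
        pos += length
    return messages


outgoing = [b"up", b"left", b"fire", b"chat:hello"]
packed = aggregate(outgoing, max_datagram=16)
print([len(d) for d in packed])
print([m for d in packed for m in split(d)])
```

Four application-level messages travel in two datagrams here, which is the kind of packet-size bookkeeping the message argues applications should not have to do themselves.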
Re: [whatwg] WebSockets: UDP
On Tue, Jun 1, 2010 at 11:12 AM, Erik Möller emol...@opera.com wrote: The use case I'd like to address in this post is Real-time client/server games. The majority of the on-line games of today use a client/server model over UDP and we should try to give game developers the tools they require to create browser based games. For many simpler games a TCP based protocol is exactly what's needed but for most real-time games a UDP based protocol is a requirement. [...] It seems to me the WebSocket interface can be easily modified to cope with UDP sockets [...] As far as I'm aware, games use UDP because they can't use TCP (since packet loss shouldn't stall the entire stream) and there's no alternative but UDP. (And also because peer-to-peer usually requires NAT punchthrough, which is much more reliable with UDP than with TCP). They don't use UDP because it's a good match for their requirements, it's just the only choice that doesn't make their requirements impossible. There are lots of features that seem very commonly desired in games: a mixture of reliable and unreliable and reliable-but-unordered channels (movement updates can be safely dropped but chat messages must never be), automatic fragmentation of large messages, automatic aggregation of small messages, flow control to avoid overloading the network, compression, etc. And there's lots of libraries that build on top of UDP to implement protocols halfway towards TCP in order to provide those features: http://msdn.microsoft.com/en-us/library/bb153248(VS.85).aspx, http://opentnl.sourceforge.net/doxydocs/fundamentals.html, http://www.jenkinssoftware.com/raknet/manual/introduction.html, http://enet.bespin.org/Features.html, etc. UDP sockets seem like a pretty inadequate solution for the use case of realtime games - everyone would have to write their own higher-level networking libraries (probably poorly and incompatibly) in JS to provide the features that they really want. 
Browsers would lose the ability to provide much security, e.g. flow control to prevent intentional/accidental DOS attacks on the user's network, since they would be too far removed from the application level to understand what they should buffer or drop or notify the application about. I think it'd be much more useful to provide a level of abstraction similar to those game networking libraries - at least the ability to send reliable and unreliable sequenced and unreliable unsequenced messages over the same connection, with automatic aggregation/fragmentation so you don't have to care about packet sizes, and dynamic flow control for reliable messages and maybe some static rate limit for unreliable messages. The API shouldn't expose details of UDP (you could implement exactly the same API over TCP, with better reliability but worse latency, or over any other protocols that become well supported in the network). -- Philip Taylor exc...@gmail.com
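The "unreliable sequenced" delivery mode mentioned above can be illustrated with a tiny receiver. This is a sketch only, assuming the sender stamps each message with an increasing integer sequence number and ignoring wraparound; the class name is hypothetical and is not taken from any of the libraries linked in the message.

```python
class SequencedReceiver:
    """Deliver only messages newer than the newest one already delivered.

    Lost messages are never retransmitted and late ones are dropped, which
    suits data such as movement updates where only the latest value matters.
    """

    def __init__(self):
        self.latest = -1

    def accept(self, seq, payload):
        if seq <= self.latest:
            return None  # stale or duplicate: drop silently
        self.latest = seq
        return payload


# Datagram 2 arrives after datagram 3, as can happen over UDP.
arrivals = [(1, "pos A"), (3, "pos C"), (2, "pos B"), (4, "pos D")]

receiver = SequencedReceiver()
delivered = []
for seq, payload in arrivals:
    result = receiver.accept(seq, payload)
    if result is not None:
        delivered.append(result)

print(delivered)
```

An unsequenced channel would hand all four payloads to the application in arrival order, and a reliable channel would additionally need acknowledgements and retransmission, so the three modes differ mainly in what the receiving side is allowed to discard.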
Re: [whatwg] WebSockets: UDP
On Tue, Jun 1, 2010 at 2:00 PM, Erik Möller emol...@opera.com wrote: [...] I've never heard any gamedevs complain how poorly UDP matches their needs so I'm not so sure about that, but you may be right it would be better to have a higher level abstraction. If we are indeed targeting the game developing community we should ask for their feedback rather than guessing what they prefer. I will grep my linked-in account for game-devs tonight and see if I can gather some feedback. More feedback is certainly good, though I think the libraries I mentioned (DirectPlay/OpenTNL/RakNet/ENet (there's probably more)) are useful as an indicator of common real needs (as opposed to edge-case or merely perceived needs) - they've been used by quite a few games and they seem to have largely converged on a core set of features, so that's better than just guessing. I guess many commercial games write their own instead of reusing third-party libraries, and I guess they often reimplement very similar concepts to these, but it would be good to have more reliable information about that. I suspect they prefer to be empowered with UDP rather than boxed into a high level protocol that doesn't fit their needs but I may be wrong. If you put it like that, I don't see why anybody would not want to be empowered :-) But that's not the choice, since they could never really have UDP - the protocol will perhaps have to be Origin-based, connection-oriented (to exchange Origin information etc), with complex packet headers so you can't trick it into talking to a DNS server, with rate limiting in the browser to prevent DOS attacks, restricted to client-server (no peer-to-peer since you probably can't run a socket server in the browser), etc. 
Once you've got all that, a simple UDP-socket-like API might not be the most natural or efficient way to implement a higher-level partially-reliable protocol - the application couldn't cooperate with the low-level network buffering to prioritise certain messages, it couldn't use the packet headers that have already been added on top of UDP, it would have to send acks from a script callback which may add some latency after a packet is received from the network, etc. So I think there's some tradeoffs and it's not a question of one low-level protocol vs one strictly more restrictive higher-level protocol. So the question to the gamedevs will be, and please make suggestions for changes and I'll do an email round tonight: If browser and server vendors agree on and standardize a socket based network interface to be used for real-time games running in the browsers, at what level would you prefer the interface to be? (Note that an interface for communicating reliably via TCP and TLS are already implemented.) - A low-level interface similar to a plain UDP socket - A medium-level interface allowing for reliable and unreliable channels, automatically compressed data, flow control, data priority etc - A high-level interface with ghosted entities That first option sounds like you're offering something very much like a plain UDP socket (and I guess anyone who's willing to write their own high-level wrapper (which is only hundreds or thousands of lines of code and not a big deal for a complex game) would prefer that since they want as much power as possible), but (as above) I think that's misleading - it's really a UDP interface on top of a protocol that has some quite different characteristics to UDP. 
So I think the question should be clearer that the protocol will necessarily include various features and restrictions on top of UDP, and the choice is whether it includes the minimal set of features needed for security and hides them behind a UDP-like interface or whether it includes higher-level features and exposes them in a higher-level interface. -- Philip Taylor exc...@gmail.com
Re: [whatwg] What will not work when we do not have server ?
On Mon, Mar 29, 2010 at 4:27 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Mon, Mar 29, 2010 at 7:05 AM, narendra sisodiya narendra.sisod...@gmail.com wrote: Dear all, I am making a (uff from long time) some e-learning modules using HTML5. The idea is just to make a full interactive lectures (audio, video, svg animations , JavaScript, canvas , all sort of new good web technologies etc ), But there is a little problem. Student will be able to download as a zip file. When they want to watch those html5 based interactive tutorials, all they need to click on index.html which will open the tutorial. I want to ask that what will not work in this mode. for example, I have cheked that some basic jQuery ajax demos are working well in both url http://localhost/narendra/demo.html OR file:///var/www/narenda/demo.html I want to know the list for all the such drafts which will not work without server. So that I will avoid them Or try to get some workaround. Anything that requires a server-side language (PHP, ASP, Python, Ruby, etc.) won't work. Anything that requires only client-side languages (HTML, CSS, Javascript) will. But you also need to be careful about security rules for file:// differing from http://, e.g. Firefox 3 apparently considers files in parent directories to be non-same-origin, so you can't use XMLHttpRequest to get ../foo/bar.txt, and if you have an img src=../images/example.png and draw it on a canvas then you won't be able to call toDataURL or getImageData, whereas it would be fine if the files were on an http:// site. -- Philip Taylor exc...@gmail.com
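The file:// restriction described above can be approximated with a path check. This is a rough model of the behaviour as reported in the message (files reached through a parent directory are treated as a different origin), not Firefox's actual implementation; the function name and the example paths are assumptions.

```python
import posixpath


def same_origin_for_local_file(document_path, relative_url):
    """Return True if the target stays in the document's directory or below it.

    Assumption: anything that resolves outside that directory is treated as
    cross-origin, as the message reports for Firefox 3.
    """
    doc_dir = posixpath.dirname(document_path)
    target = posixpath.normpath(posixpath.join(doc_dir, relative_url))
    return target.startswith(doc_dir.rstrip("/") + "/")


doc = "/var/www/narendra/demo.html"
print(same_origin_for_local_file(doc, "images/example.png"))
print(same_origin_for_local_file(doc, "../images/example.png"))
print(same_origin_for_local_file(doc, "../foo/bar.txt"))
```

For a downloadable tutorial this suggests keeping index.html at the top of the unzipped tree and referencing every resource with paths that never climb out of that directory.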
Re: [whatwg] Offscreen canvas (or canvas for web workers).
On Mon, Mar 15, 2010 at 7:05 AM, Maciej Stachowiak m...@apple.com wrote: Copying from one canvas to another is much faster than copying to/from ImageData. To make copying to a Worker worthwhile as a responsiveness improvement for rotations or downscales, in addition to the OffscreenCanvas proposal we would need a faster way to copy image data to a Worker. One possibility is to allow an OffscreenCanvas to be copied to and from a background thread. It seems this would be much much faster than copying via ImageData. Maybe this indicates that implementations of getImageData/putImageData ought to be optimised? e.g. do the expensive multiplications and divisions in the premultiplication code with SIMD. (A seemingly similar thing at http://bugzilla.openedhand.com/show_bug.cgi?id=1939 suggests SSE2 makes things 3x as fast). That would avoid the need to invent new API, and would also benefit anyone who wants to use ImageData for other purposes. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Multiple file download
On Wed, Mar 10, 2010 at 5:51 PM, Eric Uhrhane er...@google.com wrote: On Wed, Mar 10, 2010 at 12:28 AM, timeless timel...@gmail.com wrote: http://www.pkware.com/documents/casestudies/APPNOTE.TXT V. General Format of a .ZIP file the zip format is fairly streaming friendly, the directory is at the end of the file. And if you're actually generating a file which has so many records that you can't remember all of them, you're probably trying to attack my user agent, so I'm quite happy that you'd fail. Isn't a format that has its directory at the end about as streaming-UNfriendly as you can get? You need to pull the whole thing down before you can take it apart. With a .tar.gz, you can unpack files as they arrive. Each file's compressed data is preceded with a header with enough information to decompress it (filename etc), and then that information is duplicated in the central directory at the end, so I believe you can still do streaming decompression (as well as doing random access once you've got the directory). And you can still do streaming compression without even buffering a single file, by setting a flag and moving a part of the file header (lengths and checksum) to just after the compressed file data. (But I never understood why pkunzip asked me to put in the last floppy disk of a multi-disk zip before it would start decompressing the first - maybe there's some reason that streaming decompression doesn't quite work perfectly in practice?) -- Philip Taylor exc...@gmail.com
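The streaming-decompression point above can be illustrated with a minimal Python sketch. It assumes the archive stores sizes in each local file header (that is, the deferred-sizes flag is not set) and that entries are either stored or deflated; the function names and the demo file names are hypothetical, and this is not a full ZIP reader.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte part of a ZIP local file header (little-endian), per APPNOTE.TXT.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")
LOCAL_SIGNATURE = 0x04034B50


def stream_unzip(stream):
    """Yield (name, data) pairs reading strictly front to back, never seeking."""
    while True:
        raw = stream.read(LOCAL_HEADER.size)
        if len(raw) < LOCAL_HEADER.size:
            return
        (sig, _version, flags, method, _mtime, _mdate,
         _crc, comp_size, _size, name_len, extra_len) = LOCAL_HEADER.unpack(raw)
        if sig != LOCAL_SIGNATURE:
            return  # anything else (e.g. the central directory) ends the file entries
        if flags & 0x08:
            # Sizes were moved to a trailing data descriptor; not handled here.
            raise NotImplementedError("data-descriptor entries are not handled in this sketch")
        name = stream.read(name_len).decode("utf-8")
        stream.read(extra_len)  # skip the extra field
        body = stream.read(comp_size)
        if method == 8:    # deflate, stored as a raw stream without a zlib header
            data = zlib.decompress(body, -15)
        elif method == 0:  # stored uncompressed
            data = body
        else:
            raise NotImplementedError(f"compression method {method}")
        yield name, data


def make_demo_zip():
    """Build a small two-file archive in memory (hypothetical file names)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.txt", b"hello")
        zf.writestr("b.txt", b"world" * 100)
    buf.seek(0)
    return buf
```

Calling `list(stream_unzip(make_demo_zip()))` yields each file as soon as its bytes have been read, without consulting the central directory at the end. Archives written with the deferred-sizes flag would need extra handling that this sketch deliberately leaves out.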
Re: [whatwg] Parsing processing instructions in HTML syntax: 10.2.4.44 Bogus comment state
On Wed, Mar 3, 2010 at 10:55 AM, Brett Zamir bret...@yahoo.com wrote: On 3/2/2010 6:54 PM, Ian Hickson wrote: On Tue, 2 Mar 2010, Elliotte Rusty Harold wrote: Briefly it seems that <? causes the parser to go into Bogus comment state, which is fair enough. (I wouldn't really recommend that anyone use processing instructions in HTML syntax anyway.) However the parser comes out of that state at the first >. Because processing instructions can contain > and terminate only at the two character sequence ?> this could cause PI processing to terminate early and leave a lot more error handling and a confused parser state in the text yet to come. In HTML4, PIs ended at the first >, not at ?>. <?target data> is the syntax of PIs when the SGML options used by HTML4 are applied. In any case, the parser in HTML5 is based on what browsers do, which is also to terminate at the first >. It's unlikely that we can change that, given backwards-compatibility needs. Are there really a lot of folks out there depending on old HTML4-style processing instructions not being broken? Yes, e.g. a load of pages like http://www.forex.com.cn/html/2008-01/821561.htm (to pick one example at random) say: <?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /> and don't have the string ?> anywhere. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Feature proposal - add method to CanvasRenderingContext2D
On Wed, Mar 3, 2010 at 1:08 PM, František Řezáč frantisek.re...@calavera.info wrote: Description add overload of (or add similarly called) method createImageData to interface CanvasRenderingContext2D which would take two arguments: - encodedImageBinaryData - dataMimeType which are rather self explanatory. Reason The reason is to be able to supply output of the future File API standard (http://www.w3.org/TR/FileAPI/) into canvas. The canvas API already lets you do:

    var img = new Image();
    img.onload = function() {
        ctx.drawImage(img, 0, 0);
        // do processing on the canvas
    };
    img.src = 'data:image/png;base64,...'; // get this string from readAsDataURL etc

Is that sufficient for your use case? -- Philip Taylor exc...@gmail.com
Re: [whatwg] Error: Stray doctype.
On Fri, Feb 12, 2010 at 4:08 PM, Dean Edwards dean.edwa...@gmail.com wrote: http://html5.validator.nu/?doc=http://www.whatwg.org/specs/web-apps/current-work/multipage/ Oops, looks like a consequence of moving the multipage script to a server with a different version of lxml. Fixed. -- Philip Taylor exc...@gmail.com
Re: [whatwg] some thoughts on sandboxed IFRAMEs
On Thu, Feb 4, 2010 at 11:12 AM, Ian Hickson i...@hixie.ch wrote: On Mon, 25 Jan 2010, Alex Russell wrote: AFAICT, the objections fall into several buckets: 1.) Users might pick badly or may re-use nonces when they shouldn't. 2.) Escaping is believed to be more secure because it's likely to break more often, raising developer awareness 3.) The fix to correct escaping problems is believed to be more reliable I'm interested in 2 and 3. Users will do dumb things, and both 2 and 3 assumes a similar baseline scenario as 1; a developer did something dumb. Nonces need not be cryptographically strong for most apps, so the big problem is re-use. UA's have broad leeway here to prevent re-use on origins and deny sandboxing to containers that re-use the same nonces on a single page. They can even help by keeping a list of recently used nonces and denying reuse. Could you elaborate on how one could avoid reuse? That seems like a bad idea, since it would prevent any non-client caching mechanism from working. The problem is not nonce re-use, it's that the token has to be either unpredictable or unspoofable. (It could be predictable and unspoofable if it was constructed using a diagonal of the user's text.) Seems like it should be easy to get secure tokens by doing: $token = sha512_hex($input); print "<sandbox token=$token>$input</sandbox token=$token>"; (or whatever the sandbox syntax is), so there's no need to worry about cryptographically secure RNGs or nonces or reuse or caching problems. Is this what you meant by a diagonal of the user's text? (I'm assuming here that the UA treats the token as an opaque blob, it doesn't try to recompute the hash itself, so it's robust to changes in character encoding etc. People could still choose insecure tokens instead, but it's pretty trivial to use the hash solution correctly in most programming environments (easier than good random numbers). 
To attack it, you'd have to pick two strings X and Y and a hash H such that hash(X+"</sandbox token="+H+">"+Y) = H, which for a good hash function should be hard, I think.) -- Philip Taylor exc...@gmail.com
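The hash-as-token idea above can be sketched in Python roughly as follows. The `<sandbox token=...>` syntax is only the placeholder used in the message ("or whatever the sandbox syntax is"), and the function names are hypothetical; this is an illustration of the approach, not a vetted security mechanism.

```python
import hashlib


def sandbox_token(user_text: str) -> str:
    """Derive the token from the content itself, so no RNG or nonce store is needed."""
    return hashlib.sha512(user_text.encode("utf-8")).hexdigest()


def wrap_in_sandbox(user_text: str) -> str:
    """Wrap untrusted text in the placeholder sandbox syntax from the message."""
    token = sandbox_token(user_text)
    return f"<sandbox token={token}>{user_text}</sandbox token={token}>"
```

An attacker who embeds `</sandbox token=abc>` in the text cannot close the container unless `abc` equals the SHA-512 of the whole text, which is the fixed-point problem described in the message.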
Re: [whatwg] HTMLInputElement::valueAsNumber and NaN Infinity
On Mon, Jan 25, 2010 at 9:55 AM, TAMURA, Kent tk...@chromium.org wrote: It seems the current spec doesn't define behavior in a case of setting NaN or Infinity to HTMLInputElement::valueAsNumber. http://whatwg.org/html5#float-nan : Except where otherwise specified, if an IDL attribute that is a floating point number type (float) is assigned an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised. This case seems to apply for valueAsNumber. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas pixel manipulation and performance
On Mon, Nov 30, 2009 at 4:46 PM, Kenneth Russell k...@google.com wrote: CanvasPixelArray specifies that values greater than 255, including +inf, are clamped to 255 and values less than 0, including -inf, are clamped to zero. WebGLUnsignedByteArray (as people will see in the WebGL draft spec this week or next) specifies that the conversion is done with a C-style cast. The results are different for out-of-range values. I was going to say: It doesn't include +/-inf, because http://whatwg.org/html5#dependencies says if a method with an argument that is a floating point number type (float) is passed an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised, and that probably applies to the CanvasPixelArray setter method. But it looks like the spec changed since I last looked, and the setter takes an 'octet' argument, so I think the conversion should happen as per http://dev.w3.org/2006/webapi/WebIDL/#es-octet and CanvasPixelArray shouldn't define any conversion. (Filed as http://www.w3.org/Bugs/Public/show_bug.cgi?id=8405). Hopefully WebIDL and WebGL either match or can be made to match. -- Philip Taylor exc...@gmail.com
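To make the "results are different for out-of-range values" remark concrete, here is a small Python sketch comparing clamping with a modular wrap. The wrap is an assumption standing in for the cast-style conversion (it matches a modulo-256 reading of an octet conversion); the sketch only covers finite integers and ignores rounding of fractional values and the Infinity/NaN cases discussed above.

```python
def clamp_to_octet(value: int) -> int:
    """CanvasPixelArray-style behaviour as described above: saturate to 0..255."""
    return max(0, min(255, value))


def wrap_to_octet(value: int) -> int:
    """Assumed cast-style behaviour: keep only the low 8 bits (modulo 256)."""
    return value % 256
```

In-range values agree under both functions, while 300 saturates to 255 under clamping but wraps to 44, and -1 clamps to 0 but wraps to 255.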
Re: [whatwg] Canvas pixel manipulation and performance
On Sun, Nov 29, 2009 at 6:59 PM, Kenneth Russell k...@google.com wrote: On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote: Are they even byte stores, necessarily? I know in Gecko imagedata is just a JS array at the moment; it stores each of R,G,B,A as a JS Number (with the usual if it's an integer store as an integer optimization arrays do). That might well change in the future, and I hope it does, but that's the current code. I can't speak to what the behavior is in Webkit, and in particular whether it's even the same when using V8 vs Nitro. In Chromium (WebKit + V8), CanvasPixelArray property stores write individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray behave similarly but have simpler clamping semantics. Would it be helpful (for simplicity or performance or consistency etc) to change the specification of CanvasPixelArray to have those simpler clamping semantics? (I don't expect there would be compatibility problems with changing it now, particularly since Firefox doesn't implement clamping at all in CPA.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]
On Thu, Oct 22, 2009 at 9:23 PM, Øistein E. Andersen li...@coq.no wrote: On 22 Oct 2009, at 17:15, NARUSE, Yui wrote: Finally, Why ISO 2022 series is discouraged is not clear. We agree on this point. The string 숍訊昱穿 encoded as ISO-2022-KR is the bytes 0e 3c 73 63 72 69 70 74 3e. A UA that doesn't support ISO-2022-KR (e.g. Chrome, when I last checked) will decode it as Windows-1252 and get the string <script>, which is bad. So a site that uses ISO-2022-KR is very likely to expose some users to XSS attacks, which seems like a good reason to discourage that encoding. The same applies to other ISO-2022 encodings. -- Philip Taylor exc...@gmail.com
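The byte sequence quoted above can be checked with a short Python sketch that decodes it the way a UA falling back to Windows-1252 would. The function name is hypothetical, and Python's `cp1252` codec is used here as a stand-in for a browser's Windows-1252 decoder.

```python
# The bytes quoted in the message: Shift Out (0x0e) followed by eight ASCII-range bytes.
ISO_2022_KR_BYTES = bytes.fromhex("0e 3c 73 63 72 69 70 74 3e")


def decode_as_windows_1252(data: bytes) -> str:
    """Decode bytes as a UA without ISO-2022 support might, using Windows-1252."""
    return data.decode("cp1252")
```

Here `decode_as_windows_1252(ISO_2022_KR_BYTES)` returns the control character U+000E followed by the literal text `<script>`, which is why an unsupported ISO-2022 encoding can turn harmless-looking text into markup.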
Re: [whatwg] Canvas Proposal: aliasClipping property
On Fri, Oct 16, 2009 at 2:41 AM, Charles Pritchard ch...@jumis.com wrote: Having gone back and forth with Robert a bit: I was able to recall the whys of a particular issue that could be handled in this version of the spec, regarding compositing. As far as I can tell; the area (width and height, extent) of source image A [4.8.11.13 Compositing] when source image A is a shape, is not defined by the spec. And so in Chrome, when compositing with a shape, the extent of image A is only that width and height the shape covers, whereas in Firefox, the extent of image A is equivalent to the extent of image B (the current bitmap). This led to an incompatibility between the two browsers. I think the spec is clear on this (at least when I last looked; not sure if it's changed since then). Image A is infinite and filled with transparent black, then you draw the shape onto it (with no compositing yet), and then you composite the whole of image A (using globalCompositeOperation) on top of the current canvas bitmap. With some composite operations that's a different result than if you only composited pixels within the extent of the shapes you drew onto image A. (With most composite operations it makes no visible difference, because compositing transparent black onto a bitmap has no effect, so this only affects a few unusual modes.) There is currently no definition of what the extent of a shape is (does it include transparent pixels? shadows? what about text with a bitmap font? etc), and it sounds like a complicated thing to define and to implement interoperably, and I don't see obvious benefits to users, so the current specced behaviour (using infinite bitmaps, not extents) seems to me like the best approach (and we just need everyone to implement it). -- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas Proposal: aliasClipping property
On Fri, Oct 16, 2009 at 2:25 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Sat, Oct 17, 2009 at 1:06 AM, Philip Taylor excors+wha...@gmail.com wrote: I think the spec is clear on this (at least when I last looked; not sure if it's changed since then). Image A is infinite and filled with transparent black, then you draw the shape onto it (with no compositing yet), and then you composite the whole of image A (using globalCompositeOperation) on top of the current canvas bitmap. With some composite operations that's a different result than if you only composited pixels within the extent of the shapes you drew onto image A. Ah, so you mean Firefox is right in this case? Yes, mostly. http://philip.html5.org/tests/canvas/suite/tests/index.2d.composite.uncovered.html has relevant tests, matching what I believed the spec said - on Windows, Opera 10 passes them all, Firefox 3.5 passes all except 'copy' (https://bugzilla.mozilla.org/show_bug.cgi?id=366283), Safari 4 and Chrome 3 fail them all. (Looking at the spec quickly now, I don't see anything that actually states this explicitly - the only reference to infinite transparent black bitmaps is when drawing shadows. But http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html#drawing-model is phrased in terms of rendering shapes onto an image, then compositing the image within the clipping region, so I believe it is meant to work as I said (and definitely not by compositing only within the extent of the shape drawn onto the image).) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Stripping newlines from URI attributes
On Thu, Jul 30, 2009 at 2:37 PM, Elliotte Rusty Harold elh...@ibiblio.org wrote: On Wed, Jul 29, 2009 at 5:49 PM, Kartikaya Gupta lists.wha...@stakface.com wrote: It seems that most browsers do some sort of newline and tab removal from URI attributes. For example, if you have
<img src="foo
bar.jpg">
browsers will still render the image called foobar.jpg despite the CRLF pair in the middle of the src attribute. [...] This is an area where we should not attempt (and probably simply cannot) maintain compatibility with existing browsers. They're just too broken. We should attempt to maintain compatibility with existing content, and whitespace in URI attributes seems very common in existing content, e.g.: http://www.topdogphotos.com/photo-gallery/gallery11.html (newlines in <a href>, <img src>) http://www.sprig.com/coyuchi_george_or_thor_hooded_baby_towel (tabs and &#xD;&#xA; in <img src>) and loads more. -- Philip Taylor exc...@gmail.com
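The newline and tab removal described above can be sketched in Python as below. Which characters browsers actually strip, and where, is an assumption here (tab, CR and LF anywhere in the value, based on the cases mentioned in the message); the function name is hypothetical.

```python
import re


def strip_url_whitespace(value: str) -> str:
    """Remove tabs, carriage returns and line feeds from a URI attribute value."""
    return re.sub(r"[\t\r\n]", "", value)
```

With this, a `src` value of `foo`, a CRLF pair, then `bar.jpg` resolves to `foobar.jpg`, matching the behaviour reported in the message. Ordinary spaces are left alone in this sketch.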
Re: [whatwg] the cite element
On Mon, Jul 27, 2009 at 3:20 PM, Erik Vorhes e...@textivism.com wrote: On Sun, Jul 19, 2009 at 4:58 AM, Ian Hickson i...@hixie.ch wrote: In practice, people haven't been confused between these two attributes as far as we can tell. People who use <cite> seem to use it for titles, and people who use cite="" seem to use it for URLs. (The latter is rare.) See http://www.four24.com/; note near the top of the source: <blockquote id="verse" cite="John 4:24">... See http://philip.html5.org/data/cite-attribute-values.txt for some data. (Looks like non-URI values are quite rare.) Also maybe relevant: see http://philip.html5.org/data/cite.txt for some older data about <cite>. (Looks like non-title uses are very common.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Validation
On Tue, Jul 21, 2009 at 11:22 AM, Kristof Zelechovski giecr...@stegny.2a.pl wrote: <!DOCTYPE html6> would be an abomination, unless the root element changes to <html6> also :-) Also it would trigger quirks mode in many existing browsers, and in any conforming HTML5 implementation. You'd have to use something like <!DOCTYPE html SYSTEM "6"> as the shortest string that provides a version identifier, if you insist on putting it in the doctype. (The HTML5 doctype reflects that in practice there aren't several independent carefully-separated languages - there's just a single vaguely-defined mess called HTML, described in a range of specifications and sometimes not specified at all, implemented incrementally with various extensions and bugs and missing features in various browsers, with people writing pages that mix all the different features together. The version numbering is an artifact of the W3C's process of developing a numbered sequence of specifications, and isn't aligned with how HTML browsers or documents are usually written. If you want to check that your pages are compatible with certain browser releases, the language version number is a very bad approximation - you'd want a tool that understands what features IE10 supports (maybe some (but not all) from HTML4, some (but not all) from HTML5, some proprietary extensions, etc), and it would be misleading to think that a pure HTML-version-N validator is going to be good enough for that. Maybe you want some in-band mechanism for identifying which pages a spider should check with which rules, but then something like <meta name="check-ua-compatibility" content="ie=10;fx=5"> seems a better solution than a language version number in the doctype; if the problem is real, it should be examined independently of these particular solutions.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] the cite element
On Wed, Jul 1, 2009 at 6:04 PM, Erik Vorhes e...@textivism.com wrote: On Wed, Jul 1, 2009 at 11:49 AM, Kristof Zelechovski giecr...@stegny.2a.pl wrote: I can imagine two reasons the CITE element cannot be defined as citing whom: 1. Existing tools may assume it contains a title. Existing tools (which I would assume follow the HTML 4.01 spec) would be mistaken in their implementation of the cite element, then: CITE: Contains a citation or reference to other sources. (See http://www.w3.org/TR/html401/struct/text.html#h-9.2.1.) Moreover, in its sample usage, the HTML 4.01 spec uses cite for more than titles. In practical usage it seems to be used for more than titles: http://philip.html5.org/data/cite.txt. (But I haven't tried working out what else it is used for, or how commonly it's used for titles.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Removing the need for separate feeds
On Fri, May 22, 2009 at 11:45 AM, Adrian Sutton adrian.sut...@ephox.com wrote: [...] Can anyone point to examples where the content is entirely hand crafted and a feed would actually make sense? Perhaps a page like http://philip.html5.org/data.html - people might want to subscribe in their feed reader to see all the exciting updates, and the markup is all hand-written. It's not at all like a blog, but maybe it's data that could be usefully represented with Atom. Currently the markup looks like:

<ol>
<li><a href="http://philip.html5.org/data/abbr-acronym.txt"><code>abbr</code>, <code>acronym</code> titles and contents.</a> <!-- 2008-02-03 -->
<li><a href="http://philip.html5.org/data/spaced-uris.txt">URIs containing spaces.</a> <!-- 2008-02-02 -->
...
</ol>

If I understand the spec correctly, I would have to write something like:

<ol>
<li> <article pubdate="2008-02-03T00:00:00Z"> <h1><a href="http://philip.html5.org/data/abbr-acronym.txt" rel="bookmark"><code>abbr</code>, <code>acronym</code> titles and contents.</a></h1> </article>
<li> <article pubdate="2008-02-02T00:00:00Z"> <h1><a href="http://philip.html5.org/data/spaced-uris.txt" rel="bookmark">URIs containing spaces.</a></h1> </article>
...
</ol>

and then it would hopefully work. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Removing the need for separate feeds
On Fri, May 22, 2009 at 2:02 PM, Adrian Sutton adrian.sut...@ephox.com wrote: On 22/05/2009 13:32, Philip Taylor excors+wha...@gmail.com wrote: Perhaps a page like http://philip.html5.org/data.html - people might want to subscribe in their feed reader to see all the exciting updates, and the markup is all hand-written. It's not at all like a blog, but maybe it's data that could be usefully represented with Atom. There are four articles on that page - do they really update often enough to warrant anything more than just adding plain If-Modified support to feedreaders and displaying the whole page when it changes? The way I see it, there are 24 articles on the page (grouped into four categories), each published independently at separate times. There would be about a hundred if I kept that index up to date. But I'm not sure this is a very compelling example, and I can't think of any other cases where I'd possibly want to publish non-database-backed data as both HTML and Atom. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Link rot is not dangerous
On Fri, May 15, 2009 at 6:25 PM, Shelley Powers shell...@burningbird.net wrote: The most important point to take from all of this, though, is that link rot within the RDF world is an extremely rare and unlikely occurrence. That seems to be untrue in practice - see http://philip.html5.org/data/rdf-namespace-status.txt The source data is the list of common RDF namespace URIs at http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces from three years ago. Out of those 284: * 56 are 404s. (Of those, 37 end with '#', so that URI itself really ought to exist. In the other cases, it'd be possible that only the prefix+suffix URIs are meant to exist. Some of the cases are just typos, but I'm not sure how many.) * 2 are Forbidden. (Of those, 1 looks like a typo.) * 2 are Bad Gateway. * 22 could not connect to the server. (Of those, 2 weren't http:// URIs, and 1 was a typo. The others represent 13 different domains.) (For the URIs which returned Redirect responses, I didn't check what happens when you request the URI it redirected to, so there may be more failures.) Over a quarter of the most common namespace URIs don't resolve successfully today, and most of those look like they should have resolved when they were originally used, so link rot seems to be common. (Major vocabularies like RSS and FOAF are likely to exist for a long time, but they're the easiest cases to handle - we could just pre-define the prefixes rss: and foaf: and have a centralised database mapping them onto schemas/documentation/etc. It seems to me that URIs are most valuable to let any tiny group make one for their rarely-used vocabulary, and be guaranteed no name collisions without needing to communicate with a centralised registry to ensure uniqueness; but it's those cases that are most vulnerable to link rot, and in practice the links appear to fail quite often.) 
(I'm not arguing that link rot is dangerous - just that the numbers indicate it's a common situation rather than an extremely rare exception.) -- Philip Taylor exc...@gmail.com
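The "over a quarter" figure above follows directly from the counts listed in the message; the short Python sketch below just redoes that arithmetic. The dictionary labels are paraphrases of the categories in the message, and redirects are not counted because they were not followed.

```python
# Failure counts as listed in the message, out of 284 common RDF namespace URIs.
FAILURES = {
    "404 Not Found": 56,
    "Forbidden": 2,
    "Bad Gateway": 2,
    "could not connect": 22,
}
TOTAL_NAMESPACES = 284

failed = sum(FAILURES.values())          # total URIs that did not resolve
failed_fraction = failed / TOTAL_NAMESPACES
```

That gives 82 failures out of 284, or roughly 29%, which is consistent with "over a quarter".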
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Thu, May 14, 2009 at 1:25 PM, Dan Brickley dan...@danbri.org wrote: Having HTML5-microdata -to- RDF parsers is pretty critical to having test cases that help us all understand where RDFa-Classic and HTML5 diverge. I'm very happy to see this work being done and that there are multiple implementations. As far as I can see, the main point of divergence is around URI abbreviation mechanisms. But also HTML5 might not have a notion equivalent to RDF/RDFa's bNodes construct. The sooner we have these parsers the sooner we'll know for sure. If I understand RDF correctly, the idea is that everything can be URIs, subjects and objects can instead be blank nodes, and objects can instead be literals. If we restrict literals to strings (optionally with languages), then I think all triples must follow one of these eight patterns:

<urn:subject> <urn:predicate> <urn:object> .
<urn:subject> <urn:predicate> "object" .
<urn:subject> <urn:predicate> "object"@lang .
<urn:subject> <urn:predicate> _:X .
_:X <urn:predicate> <urn:object> .
_:X <urn:predicate> "object" .
_:X <urn:predicate> "object"@lang .
_:X <urn:predicate> _:Y .

These cases can be trivially mapped into HTML5 microdata as:

<div item> <link itemprop="about" href="urn:subject"> <link itemprop="urn:predicate" href="urn:object"> </div>
<div item> <link itemprop="about" href="urn:subject"> <meta itemprop="urn:predicate" content="object"> </div>
<div item> <link itemprop="about" href="urn:subject"> <meta itemprop="urn:predicate" content="object" lang="lang"> </div>
<div item> <link itemprop="about" href="urn:subject"> <meta itemprop="urn:predicate" item id="X"> </div>
<link subject="X" itemprop="urn:predicate" href="urn:object">
<meta subject="X" itemprop="urn:predicate" content="object">
<meta subject="X" itemprop="urn:predicate" content="object" lang="lang">
<meta subject="X" itemprop="urn:predicate" item id="Y">

(There's the caveat about <link> and <meta> being moved into <head> in some browsers; you can replace them with <a> and <span> instead.) 
These aren't the most elegant ways of expressing complex structures (because they don't make much use of nesting), but hopefully they demonstrate that it's possible to express any RDF graph (that only uses string literals) by decomposing into triples and then writing as HTML with these patterns. (If all the triples using a blank node have the same subject, then you don't need to use 'id' and 'subject' because you can just nest the markup instead, I think.) With my parser (in Firefox 3.0), the output triples (sorted into a clearer order) are:

<> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
<> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
<> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
<> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
<urn:subject> <urn:predicate> <urn:object> .
<urn:subject> <urn:predicate> "object" .
<urn:subject> <urn:predicate> "object"@lang .
<urn:subject> <urn:predicate> _:n0 .
_:n0 <urn:predicate> <urn:object> .
_:n0 <urn:predicate> "object" .
_:n0 <urn:predicate> "object"@lang .
_:n0 <urn:predicate> _:n1 .

which corresponds to what was desired. So, I can't see any limits on expressivity other than that literals must be strings. (But I'm not at all an expert on RDF, and I may have missed something in the microdata spec, so please let me know if I'm wrong!) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Thu, May 14, 2009 at 2:54 PM, Philip Taylor excors+wha...@gmail.com wrote: [...] <urn:subject> <urn:predicate> _:X . [...] <div item> <link itemprop="about" href="urn:subject"> <meta itemprop="urn:predicate" item id="X"> </div> [...] So, I can't see any limits on expressivity other than that literals must be strings. Hmm, I think I'm wrong here. 'id' has to be unique, which means this pattern won't work if _:X is the object for triples with two different subjects. Additionally, there must be a chain from every blank node back to <> via <http://www.w3.org/1999/xhtml/vocab#item>, else it won't get serialised (since serialisation starts from top-level items and recurses down the correspondence chains). As a consequence of this and the previous point, it is impossible to express cycles (e.g. _:X <urn:predicate> _:X, or any longer cycles) unless the cycle contains <>. So there are these two restrictions on the shapes of expressible RDF graphs. (I can't think of any other restrictions, though...) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Tue, May 12, 2009 at 11:55 AM, Eduard Pascual herenva...@gmail.com wrote: [...] (at least for now: many RDFa-aware agents vs. zero HTML5's microdata-aware agents) HTML5 microdata parsers seem pretty trivial to write - http://philip.html5.org/demos/microdata/demo.html is only about two hundred lines to read all the data and to produce JSON and N3-serialised RDF. It shouldn't take more than a few hours to produce a similar library for other languages, including the time taken to read the spec, so the implementation cost for generic parser libraries doesn't seem like a significant problem. The cost of integration with backend RDF-based systems seems more significant - hopefully you could simply replace the frontend RDFa parser with a microdata parser and generate the same RDF triples and it would all work fine, but I don't know whether that's true in practice (because maybe the microdata syntax is too restrictive to represent the vocabularies people want to use, and so they'd have to go to lots of extra effort to create a new vocabulary). [...] there are other cases where separate values might be needed: for example using a street address for the human-readable representation of a location and the exact geographic coordinates as the machine-readable (since not all micro-data parsers can rely on Google Maps's database to resolve street addresses, you know); or using a colored name (such as lime green displayed on lime green color) as the human-readable representation of a color, and the hexcode (like #00FF00) as the machine-readable representation. You could replace

<span itemprop="color">lime green</span> <span itemprop="location">1 High Street</span>

with

<meta itemprop="color" content="#00FF00"><span>lime green</span> <meta itemprop="location.lat" content="56.78"><meta itemprop="location.long" content="-12.34"><span>1 High Street</span>

to get the desired output. (Not particularly elegant syntax, though.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Tue, May 12, 2009 at 10:21 PM, Sam Ruby ru...@intertwingly.net wrote: On Tue, May 12, 2009 at 4:34 PM, Shelley Powers shell...@burningbird.net wrote: I would say if your fellow Google developers could understand how this all works, there is hope for others. if http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009May/0064.html Also: The instructions at http://google.com/support/webmasters/bin/answer.py?answer=146898 (and related pages) alternate between xmlns:v="http://rdf.data-vocabulary.org" and xmlns:v="http://rdf.data-vocabulary.org/" seemingly at random. (The first means that property="v:name" abbreviates the bogus URI "http://rdf.data-vocabulary.orgname", if I understand correctly. The second means it's "http://rdf.data-vocabulary.org/name" which is a 404. Perhaps they meant xmlns:v="http://rdf.data-vocabulary.org/#" which would point at the relevant bit of the vocabulary RDF file? Hopefully people won't actually deploy content using the inconsistent namespaces before the documentation is fixed...) (They've also got a <span><strong property="v:name"> and <span><span property="v:locality"> and some unclosed <a>s, so it seems the documentation writers are having difficulty even writing plain HTML.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Mon, May 11, 2009 at 6:15 PM, Giovanni Gentili giovanni.gent...@gmail.com wrote: * a user (or groups of users) wants to annotate items present on a generic web page with additional properties in a certain vocabulary. for example Joe wants to gather in a blog a series of personal annotation to movies (or other type of items) present in imdb.com. [...] this option require that @subject accept: 1) ID of an element with an item attribute, in the same Document or 2) valid URL of an element with an item attribute elsewhere in the web or 3) a valid URL (ithe item is the referred document or fragment) For the RDF output, you can use <link property="about" href="http://subject/"> to create triples whose subject is a URL. (I believe in general you can also do:

<meta item id="n0">
<link subject="n0" property="about" href="http://subject/">
<link subject="n0" property="http://predicate1/" href="http://object1/">
<meta subject="n0" property="http://predicate2/" content="object2">

to represent arbitrary RDF triples.) I don't think it would make sense for @subject to be a URL when generating JSON output, because there wouldn't be anywhere to represent that URL in the output structure. But there could be a convention that properties called "about" indicate the URLs that the item applies to, and then it would work with exactly the same markup as the RDF case. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Annotating structured data that HTML has no semantics for
On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote: One of the more elaborate use cases I collected from the e-mails sent in over the past few months was the following: USE CASE: Annotate structured data that HTML has no semantics for, and which nobody has annotated before, and may never again, for private use or use in a small self-contained community. [...] To address this use case and its scenarios, I've added to HTML5 a simple syntax (three new attributes) based on RDFa. There's a quickly-hacked-together demo at http://philip.html5.org/demos/microdata/demo.html (works in at least Firefox and Opera), which attempts to show you the JSON serialisation of the embedded data, which might help in examining the proposal. -- Philip Taylor exc...@gmail.com
[whatwg] Typo
http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#the-pattern-attribute says:

For example, the following snippet:

    <label> Part number: <input pattern="[0-9][A-Z]{3}" name="part" title="A part number is a digit followed by three uppercase letters."/> </label>

...could cause the UA to display an alert such as:

    part number is a digit followed by three uppercase letters.
    You cannot complete this form until the field is correct.

which is missing the "A" in the last-but-one line.

-- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas Shadows - Unnecessary Barrier to Entry
On Fri, Mar 27, 2009 at 11:22 PM, Charles Pritchard ch...@jumis.com wrote: [...] We've been working on Javascript / Canvas projects for two years now. We're in the process of releasing full implementations targeting the Common Runtime Language, Java AWT, ActionScript and DCOM. I'm sure you can all recognize, that these components have their own vector APIs, and that we're only sending requests through as a proxy. While we can implement everything, even the non-zero winding rule, there one part of the specification that's absolutely rotten. And that's the #shadows section. I love a shadow, I love a good looking UI, but most of these APIs do not have shadow support for shapes. Do the APIs not provide enough features so you can implement shadows yourself? e.g. Firefox uses Cairo which doesn't have any native support for shadows; but it can draw shapes onto an alpha-only surface, manually blur the pixels (if you can implement getImageData then I assume you must already have access to the raw pixels and can do the blurring efficiently), then draw the shape again, and composite everything appropriately, which results in a correct shadow implementation. I don't see what makes this fundamentally harder than implementing all the other required canvas features. [...] -- Philip Taylor exc...@gmail.com
Re: [whatwg] Historic dates in HTML5
On Thu, Mar 5, 2009 at 11:33 AM, j...@eatyourgreens.org.uk j...@eatyourgreens.org.uk wrote: [...] Bruce Lawson uses <time> to mark up the dates of blog posts in the HTML5 version of his wordpress templates. Is this incorrect usage of HTML5? If not, how should HTML5 blog templates work when the blog is dated from 1665 (http://pepysdiary.com) or 1894 (http://www.cosmicdiary1894.blogspot.com/)?

This reminds me of the issue I had with the old <img alt="{description}"> syntax. People write software that takes some input, and outputs some markup. They want to guarantee that their markup will be valid and correctly interpreted by consumers, regardless of the input. (In the <img alt> case, the problem was when the input resulted in legitimate alternative (non-description) alt text that started with { and ended with }, forcing the application to add complexity to make sure its output won't be misinterpreted.)

In any situation where they use <time>, they'd probably want to write something like:

    print "<time datetime='".$t->toISO8601Date()."'>".$t->toLocalisedHumanReadableDate()."</time>";

But given HTML5's restrictions against BCE years, they'd actually have to write something more like:

    if ($t->getYear() > 0) { # (be careful not to write >= 0 here)
        print "<time class='time' datetime='".$t->toISO8601Date()."'>".$t->toLocalisedHumanReadableDate()."</time>";
    } else {
        print "<span class='time'>".$t->toLocalisedHumanReadableDate()."</span>";
    }

and make sure their stylesheets use the selector .time instead of time, to guarantee everything is going to work correctly even with unexpected input values. So the restriction adds complexity (and bugs) to code that wants to be good and careful and generate valid markup.

-- Philip Taylor exc...@gmail.com
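As a rough sketch of the producer-side workaround described in the message above, here is the same branching written in Python. The function name render_time and its arguments are made up for illustration, and the only rule it encodes is the "year must be greater than 0" check mentioned in the mail; it is not a complete validity check for HTML5 dates.

```python
from html import escape


def render_time(year, month, day, label):
    """Emit <time> when the year is allowed, otherwise fall back to <span>.

    Hypothetical helper: 'label' is the human-readable date text.
    """
    if year > 0:  # careful: ">= 0" would wrongly let year 0 through
        iso = f"{year:04d}-{month:02d}-{day:02d}"
        return f'<time class="time" datetime="{iso}">{escape(label)}</time>'
    # Years that <time> cannot carry still get the same class for styling.
    return f'<span class="time">{escape(label)}</span>'
```

Both branches carry class="time", which is why the stylesheet has to select on .time rather than on the element name.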
Re: [whatwg] Historic dates in HTML5
On Thu, Mar 5, 2009 at 12:56 PM, James Graham jgra...@opera.com wrote: Philip Taylor wrote: and make sure their stylesheets use the selector .time instead of time, to guarantee everything is going to work correctly even with unexpected input values. So the restriction adds complexity (and bugs) to code that wants to be good and careful and generate valid markup. On the other hand the python datetime class doesn't seem to support years <= 0 at all so consuming software written in python would have to re-implement the whole datetime module, potentially causing incompatibilities with third party libraries that expect datetimes to have year > 0. This seems like a great deal more effort than simply checking that dates are in the allowed range before serializing or consuming them in languages that do support years <= 0.

The Python datetime class doesn't seem to support years > 9999 either, which HTML5 allows. So Python consumers will already have to do "if not year <= 9999: discard this time element since I'm not going to be able to do anything with it", and it's easy for them to change that to "if not 1 <= year <= 9999:". That seems less effort than adding checks into the producers.

If there is a desire that any valid HTML5 date-time string should be representable in Python's datetime class, then HTML5 should limit it to 4 digits and refuse to parse anything longer. If so, why Python's datetime in particular? The C++ Boost.Date_Time (http://www.boost.org/doc/libs/1_38_0/doc/html/date_time/gregorian.html) is apparently limited to 1400-Jan-01 to 9999-Dec-31. Perl DateTime and PHP DateTime and Java joda-time (http://joda-time.sourceforge.net/field.html) seem happy with a range of millions of years in both directions. I'm not sure about any other libraries. The range 1..9999 seems pretty arbitrary since it only matches Python, and 1..inf doesn't match anything, so neither seems particularly justified by implementations.

-- Philip Taylor exc...@gmail.com
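For reference, the consumer-side check discussed in the message above might look like this in Python. datetime.MINYEAR and datetime.MAXYEAR are the real module constants; the function name year_is_usable is just an illustrative assumption.

```python
import datetime


def year_is_usable(year):
    """Return True if Python's datetime types can represent this year."""
    # MINYEAR is 1 and MAXYEAR is 9999, so both year 0 and five-digit
    # years have to be discarded (or handled some other way) by a consumer.
    return datetime.MINYEAR <= year <= datetime.MAXYEAR
```

A consumer would call this before constructing a datetime.date, since the constructor raises ValueError for years outside that range.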
Re: [whatwg] proposed canvas 2d API additions
On Sat, Feb 28, 2009 at 8:38 PM, JustFillBug mozbug...@yahoo.com.au wrote: On 2006-04-26, Ian Hickson i...@hixie.ch wrote: On Mon, 24 Apr 2006, Vladimir Vukicevic wrote: We can always add isPointInStrokedPath if we ever want to bother with that (which is where the ...Fill bit came from in my API, because the region covered by a stroked path and that covered by a filled path are different, even though testing for a hit against a filled region would by far be the common case). We can also call the other one isPointOnPath(), if we want to keep the method names reasonably short. I'm not sure we'll ever need to add it, though. Getting people to click on a line is generally silly. (Or maybe we could add a convertStrokeToPath() function, which replaces the current path with a path representing the outline of what you'd get if you stroked the current path, and then use isPointInPath on it.) We do have a need of isPointOnPath() for editing Bezier lines interactively (on a font editing interface). [...] Doing point on curve in javascript is painful. And since checking isPointInPath() already need to detect the on edge case, this shouldn't be too much a burdern on the browser developers. So please conside add the isPointOnPath() call to the function. What makes it painful? If you're only using Beziers, it doesn't seem too hard to approximate the curves as line segments and then calculate distances from that. http://philip.html5.org/demos/canvas/bezier-approx.html is fairly straightforward (and a much more accurate version shouldn't be much more complex) and can detect when your mouse is near a curve. (But if this is a common problem, it would indeed be nicer if the canvas API provided the functionality instead of forcing you to reimplement it.) -- Philip Taylor exc...@gmail.com
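A minimal sketch of the approach suggested in the message above (approximate the Bezier as line segments, then measure the distance from the mouse position to each segment), written in Python for brevity. This is not the code behind the linked demo; the function names, the tolerance, and the segment count are all assumptions chosen for illustration.

```python
import math


def cubic_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier at parameter t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return (x, y)


def dist_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    if len2 == 0:  # degenerate segment: a and b coincide
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def is_point_near_curve(p, p0, p1, p2, p3, tolerance=3.0, segments=32):
    """Approximate the curve with straight segments and test proximity."""
    pts = [cubic_point(p0, p1, p2, p3, i / segments) for i in range(segments + 1)]
    return any(dist_to_segment(p, a, b) <= tolerance for a, b in zip(pts, pts[1:]))
```

Raising the segment count trades speed for accuracy; an adaptive subdivision would be the "much more accurate version" but is left out here.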
Re: [whatwg] Canvas arcTo all points on a line
On Sat, Dec 27, 2008 at 9:37 AM, Dirk Schulze vb...@gmx.de wrote: Hi, have two questions to the all points on a line part of canvas' arcTo. A short example: moveTo(50,0); arcTo(100,0, 0,0, 10); This should add a new, from p1 infinite far away, point to the subpath and draw a straight line to it. Two questions. 1) If I add lineTo(50, 50); after arcTo(..). Wouldn't it draw a quasi parallel line to the line of arcTo? Because (Xx, Yx) (mentioned in the spec) is infinite far away. That means, we will never reach this point in reality. It should draw a really parallel line, with one end at (50,50) and the other end infinitely far away in the direction determined by the arcTo. 2) We don't allow infinite values for moveTo or lineTo, but can make this happen with arcTo. The example above would be the same as lineTo(-Infinite, 0); But we can make moveTo(-Infinite, 0) too with the example above. Just make strokeStyle transparent, use arcTo from above and you're done. And moveTo(infinite, infinite); would be possible too. You can moveTo(-1e+300, 0) and moveTo(1e+300, 2e+300), which are much more similar to what arcTo is meant to do. Considering the general case where the arcTo's points are not perfectly horizontal, the idea is that the point is not simply a point with coordinates (+/-Infinity, +/-Infinity) - it's really the (theoretical) limit of a point with coordinates (x+dx*t, y+dy*t) as t approaches infinity, where x,y,dx,dy represent the position/direction of the (x1,y1)--(x2,y2) line. Where the spec says (x∞, y∞) is the point that is infinitely far away from (x1, y1), that lies on the same line as (x0, y0), (x1, y1), and (x2, y2), you could read it as ...the point that is very very far away from ..., e.g. take the (x1,y1)--(x2,y2) line and then move 1e+100 units in that direction, and it would be good enough that nobody would notice the tiny error. 
You already have to handle something very similar to this case, because (x2,y2) might be very very close to the line (x0,y0)--(x1,y1), which means the start/end tangent points will be very very far away in the appropriate direction. The special case where (x2,y2) is precisely on the line is not really special - the points are just even further (infinitely far) away in that direction. As a concrete example: see http://philip.html5.org/demos/canvas/arcto-inf.html, which I believe should have output like http://philip.html5.org/demos/canvas/arcto-inf.png (from Safari 3.0.4 for Windows). As (x2,y2) gets closer to the line of the first two points, the start/end tangent points are pushed further over to the left. When y2=0.1 they're far enough away that the two straight lines are nearly horizontal; when y2=0 it's basically the same, except now they're precisely horizontal. So I think the spec's behaviour makes sense from a theoretical perspective, because it avoids any discontinuities in the output when the input variables are changed a tiny bit. And it made sense from a practical perspective, because it matched the behaviour of Safari 3.0 (though apparently things have changed in 3.1). But I don't know if it makes sense from the perspective of someone who's got to write an independent implementation of it. Does the above explanation make more sense than the text in the spec? and if so, does it seem implementable? If so, it seems best to keep the spec's behaviour and try to clarify the spec's text. But this doesn't seem like an important case where users will be unhappy if e.g. the arcTo call draws nothing when all the points are on the same line, so if it's still a pain to implement the spec's behaviour then I would be happy with changing what the spec requires. -- Philip Taylor exc...@gmail.com
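To put numbers on the "tangent points are pushed further away" argument in the message above, here is a small Python sketch. It uses the standard geometric relation for a circle of radius r tangent to two lines meeting at angle theta (the tangent points lie r / tan(theta / 2) from the corner); that formula is my assumption about how to model arcTo here, not text from the spec, and the function name is made up.

```python
import math


def arcto_tangent_distance(p0, p1, p2, radius):
    """Distance from p1 to each tangent point of the arcTo circle.

    Returns math.inf when p0 and p2 lie in exactly the same direction
    from p1 (the 'all points on a line' case from the thread).
    """
    ax, ay = p0[0] - p1[0], p0[1] - p1[1]
    bx, by = p2[0] - p1[0], p2[1] - p1[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    theta = math.atan2(abs(cross), dot)  # angle at p1, between 0 and pi
    if theta == 0:
        return math.inf
    return radius / math.tan(theta / 2)
```

With moveTo(50,0) and arcTo(100,0, 0,y2, 10), the returned distance grows without bound as y2 shrinks towards 0, so the y2=0 case is the limit of the nearby cases rather than something discontinuous.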
Re: [whatwg] Canvas arcTo all points on a line
On Wed, Jan 21, 2009 at 2:45 PM, Philip Taylor excors+wha...@gmail.com wrote: On Sat, Dec 27, 2008 at 9:37 AM, Dirk Schulze vb...@gmx.de wrote: Hi, have two questions to the all points on a line part of canvas' arcTo. A short example: moveTo(50,0); arcTo(100,0, 0,0, 10); This should add a new, from p1 infinite far away, point to the subpath and draw a straight line to it. [...]

After some discussion on IRC, it seems this part of the spec is not a great idea. As I understand it, the low-level graphics APIs have limited coordinate range and rely on the "User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations." clause (and common sense) to let them have undefined behaviour when people use really large coordinate values. The infinitely-distant point required by arcTo is a really large coordinate value, but we don't want this case to be undefined behaviour (because it can occur with nice small integer input values and people might accidentally use it).

Implementing the behaviour currently in the spec (with the infinitely-distant point) is not trivial, because it requires code unique to that special case (rather than falling naturally out of an implementation of the rest of arcTo's behaviour) and has to be careful to act enough like an infinitely-distant point while remaining within the implementation limits. And it seems like a rare edge case where people disagree on whether the output is sensible, and nobody is really going to care what the output is (as long as it's well defined); so it doesn't seem worthwhile having everyone understand and implement the non-trivial behaviour that's in the spec.
So, in the interest of having something that implementors are more likely to converge on, I'd suggest replacing the behaviour in that case (the "the direction from (x0, y0) to (x1, y1) is the opposite of the direction from (x1, y1) to (x2, y2)" case) with simply drawing a straight line from (x0, y0) to (x1, y1), which is easy and apparently is what Safari on OS X already does. It's also the same as the other case in that paragraph, so the whole paragraph can be collapsed to:

"Otherwise, if the points (x0, y0), (x1, y1), and (x2, y2) all lie on a single straight line, then the method must add the point (x1, y1) to the subpath, and connect that point to the previous point (x0, y0) by a straight line."

-- Philip Taylor exc...@gmail.com
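The collapsed rule proposed in the message above could be sketched roughly as follows, in Python. Representing the subpath as a list of (x, y) tuples and the function name are assumptions made for illustration; a real implementation would also need to decide how to treat nearly-collinear points, which this exact comparison does not address.

```python
def arcto_collinear_case(subpath, p1, p2):
    """Handle arcTo when (x0, y0), (x1, y1), (x2, y2) are on one line.

    'subpath' is a list of points whose last entry is (x0, y0).
    Returns True if the straight-line rule applied, False otherwise.
    """
    x0, y0 = subpath[-1]
    x1, y1 = p1
    x2, y2 = p2
    # The cross product of (p1 - p0) and (p2 - p1) is zero exactly when
    # the three points lie on a single straight line.
    cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
    if cross == 0:
        subpath.append((x1, y1))  # straight line from (x0, y0) to (x1, y1)
        return True
    return False
```

With the example from the thread, moveTo(50,0) followed by arcTo(100,0, 0,0, 10), this simply extends the subpath to (100, 0).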
[whatwg] </html> with omitted tags
I can start with a simple document that's probably conforming and that the validator doesn't complain about:

    <!DOCTYPE html><html><head><title></title></head><body></body></html>

Then I can read the "Writing HTML documents: Optional tags" section, which says:

"A head element's end tag may be omitted if the head element is not immediately followed by a space character or a comment. A body element's start tag may be omitted if the first thing inside the body element is not a space character or a comment, except if the first thing inside the body element is a script or style element. A body element's end tag may be omitted if the body element is not immediately followed by a comment."

So I choose to omit the </head><body></body> because I think those rules say I can do so. I get:

    <!DOCTYPE html><html><head><title></title></html>

But now I get a parse error, which I think is because the </html> comes in the "in head" insertion mode and is "Any other end tag: Parse error. Ignore the token.", so something seems wrong.

-- Philip Taylor exc...@gmail.com
Re: [whatwg] 8.2.4.37: EOF handling
On Mon, Dec 22, 2008 at 9:33 PM, Edward Z. Yang edwardzy...@thewritingpot.com wrote: Hello all, I think EOF should be handled explicitly in the states after we Consume the U+0023 NUMBER SIGN, since the spec as it stands right now implies that there will always be another character after the number sign. Or am I being a little redundant? EOF is always treated as if it were a character, e.g. lots of places say Consume the next input character: ... EOF - ... Reconsume the EOF character in the data state. If you have # at the end of a file, the next character is the EOF character, which is not 'x' or 'X' and so it is anything else. So it seems consistent and unambiguous to me. -- Philip Taylor exc...@gmail.com
Re: [whatwg] Consuming ampersands
On Tue, Dec 23, 2008 at 1:08 AM, Edward Z. Yang edwardzy...@thewritingpot.com wrote: Hello all, When I'm consuming a character reference, when does the ampersand get consumed? This doesn't seem to be obvious from the documentation, which talks of consuming character references and number hash signs, but never the ampersand.

They're consumed in the state that comes before the character reference state, e.g.:

    8.2.4.1 Data state
    Consume the next input character:
    - U+0026 AMPERSAND (&) ... switch to the character reference data state.

-- Philip Taylor exc...@gmail.com
Re: [whatwg] Byte-wise tokenization algorithm
On Sun, Dec 21, 2008 at 5:41 AM, Ian Hickson i...@hixie.ch wrote: On Sat, 20 Dec 2008, Edward Z. Yang wrote: 1. Given an input stream that is known to be valid UTF-8, is it possible to implement the tokenization algorithm with byte-wise operations only? I think it's possible, since all of the character matching parts of the algorithm map to characters in ASCII space. Yes. (At least, that's the intent; if you find anything that contradicts that, please let me know.)

I think there are some cases where it still should work but you might have to be a little careful - e.g. "<table>foo" notionally results in three parse errors according to the spec (one for each character token which gets foster-parented), so "<table>☹" results in one if you work with Unicode characters but three if you treat each UTF-8 byte as a separate character token. But in practice, tokenisers emit sequence-of-many-characters tokens instead of single-character tokens, so they only emit one parse error for "<table>foo", and the html5lib test cases assume that behaviour, and it should work identically if you have sequence-of-many-bytes tokens instead.

(Apparently only the distinction between 0 and more-than-0 parse errors is important as far as the spec is concerned, since that has an effect on whether the document is conforming; but it seems useful for implementors to share test cases that are precise about exactly where all the parse errors are emitted, since that helps find bugs, and so the parse error count is relevant.)

-- Philip Taylor exc...@gmail.com
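The character-versus-byte counts mentioned in the message above are easy to check in Python. The two helper names are made up; they only count how many single-unit "character tokens" a naive tokeniser would emit for a run of text under each model.

```python
def count_character_tokens(text):
    """One token per Unicode code point."""
    return len(text)


def count_byte_tokens(text):
    """One token per UTF-8 byte, as a byte-wise tokeniser would see it."""
    return len(text.encode("utf-8"))


def is_ascii_safe(text):
    """True if no byte of a non-ASCII character could be mistaken for ASCII."""
    return all(b >= 0x80 for ch in text if ord(ch) > 0x7F
               for b in ch.encode("utf-8"))
```

The last helper reflects why byte-wise matching of ASCII delimiters works at all: every byte of a multi-byte UTF-8 sequence has its high bit set, so it can never collide with characters such as "<" or "&".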
Re: [whatwg] Solving the login/logout problem in HTML
On Wed, Nov 26, 2008 at 10:12 AM, Ian Hickson [EMAIL PROTECTED] wrote: On Wed, 26 Nov 2008, Julian Reschke wrote: Ian Hickson wrote: ... As can be seen in the feedback below, there is interest in improving the So when you get to a page that expects you to be logged in, it return a 401 with: WWW-Authenticate: HTML form="login" ...and there must be a form element with name="login", which represents the form that must be submitted to log in. ... For security reasons, I'd prefer that to be the form element, instead of a form element -- having multiple copies of the name in the same document should be considered a fatal error. Having multiple form elements with the same name is already an error. I'm not sure what you mean by fatal error. The spec precisely defines which form should be used in the case of multiple forms with the same name. Could you describe the attack scenario you are considering?

If I'm not misunderstanding things, there is a new attack scenario:

I post a comment on someone's blog, saying

    <a href="/restricted-access.php?xsshole=<form action='http://hacker.example.com/capture' name='login'><input name='username'><input name='password'></form>">crawl me!</a>

On their blog's web server, restricted-access.php requires authentication, and unauthenticated access results in a 401 with 'WWW-Authenticate: HTML form="login"' and the appropriate login form. But inevitably there's some kind of XSS hole in that page, so arbitrary markup can be inserted above the real login form. (Maybe they pass an error message in a parameter, which will be displayed above the form, but they forgot to escape the output.)

Their internal search engine crawler is configured to know a username and password (and the form field names etc) for these restricted areas. It follows the link from my blog comment, it notices the WWW-Authenticate header, and like a good little bot it chooses to parse the HTML page and find the matching form and fill in the fields and submit the login details.
But actually it's submitting my XSS-inserted form, and sending the login details to me.

XSS holes already cause various security vulnerabilities; but they can't currently result in sensibly-written crawlers unwittingly submitting their login details to arbitrary third parties, so this is a new risk.

I can imagine a few ways to avoid this problem:
1) Don't write any pages with XSS holes.
2) Detect tampering by refusing to submit login details if >= 2 forms match the name.
3) Only submit login details to same-origin URLs, or to some other restricted set.
4) Configure the crawler with the form submission URL, as well as the form field names and values, so it doesn't have to trust the HTML.
5) Change WWW-Authenticate so it gives all the details (submission URL, field names, etc), so nobody has to trust the HTML.

But (1) is not going to happen in reality, so we should try to minimise the damage when XSS holes exist.

(2) won't work because the attacker could write '...?xsshole=...<!--' and the second form would be hidden.

(3) is more sensible; perhaps the spec should explicitly note that you need to be quite careful about not submitting login forms to third-party sites unless you're sure you trust them? But even with (3), I could write <a href="/restricted-access.php?xsshole=<form action='/public-pastebin.php'... and the crawler would send the login details to somewhere on the same host where I could still read them back, which doesn't seem great.

So (4) is more sensible. You already have to configure the crawler with the form field names, so you might as well tell it what URL to submit to, and it shouldn't parse the HTML response or care about the form element. (Then there's no need for WWW-Authenticate to even say what the form name is.)

(5) is basically the same, except it's late-binding the form details rather than hardcoding them into the crawler's configuration, and so it makes it easy to change the server-side login handling without reconfiguring everyone's crawlers.
(But the cost of the potential solutions to the vulnerability might be greater than the cost of the vulnerability, so it might not be worth doing anything - I don't have a useful opinion on that.) -- Philip Taylor [EMAIL PROTECTED]
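For what it's worth, option (3) from the message above could be sketched like this in Python, using only urllib.parse. The function names and the example hostname blog.example.org are assumptions for illustration; this is a simplified origin comparison, not a full implementation of any particular same-origin definition.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return a (scheme, host, port) tuple for an absolute URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_same_origin_action(page_url, form_action):
    """True if the form's action resolves to the same origin as the page."""
    target = urljoin(page_url, form_action)  # resolve relative actions
    return origin(target) == origin(page_url)
```

Note that an action of /public-pastebin.php on the same host passes this check, which is exactly the weakness of option (3) pointed out in the mail.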
Re: [whatwg] Absent rev?
On Wed, Nov 19, 2008 at 9:35 AM, Martin McEvoy [EMAIL PROTECTED] wrote: [...] http://code.google.com/webstats/2005-12/linkrels.html [...] If you have a more up to date study on link relationships, please can I have a link? http://philip.html5.org/data/link-rel-rev.txt has some more recent data, from a different set of pages (and so with different biases, e.g. there's lots of Wikipedia and IMDB pages using rel=apple-touch-icon), with less processing (no case-insensitivity or token-splitting). -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Absent rev?
On Wed, Nov 19, 2008 at 1:03 PM, Martin McEvoy [EMAIL PROTECTED] wrote: Philip Taylor wrote: http://philip.html5.org/data/link-rel-rev.txt has some more recent data, from a different set of pages (and so with different biases, e.g. there's lots of Wikipedia and IMDB pages using rel=apple-touch-icon), with less processing (no case-insensitivity or token-splitting). Thank you Philip that is the most useful set of data I have seen for a long time It basically says that the whole premise that HTML5 should drop *rev* (a) because authors use it wrong, (b) Many authors use rev-stylesheet wrong, is a MYTH and an inaccurate assessment of the *rev* attribute Out of the 127249 pages studied, only 0.09% actually use rev=stylesheet

The premise from near the beginning of this thread was: "We did some studies and found that the attribute was almost never used, and most of the time, when it was used, it was a typo where someone meant to write rel= but wrote rev=."

I think that ought to say "... (excluding rev=made, which is uninteresting since it's redundant with rel=author)".

In that case, rev is used on 0.2% of pages, which justifies the claim "almost never used". And rev=stylesheet makes up 57% of those uses of rev, which justifies the claim "most of the time ... it was a typo" (under a loose definition of "typo" that includes people copying-and-pasting without understanding the distinction between rel and rev, which is the impression I get from looking at some of these pages). And looking at some other values, e.g. <link rev="start" href="/" title="Home Page" />, which seems like it ought to be rel instead, there are typos in more cases than just rev=stylesheet. So the premise seems valid.

-- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Absent rev?
On Wed, Nov 19, 2008 at 2:54 PM, Martin McEvoy [EMAIL PROTECTED] wrote: Philip Taylor wrote: rev=stylesheet makes up 57% of those uses of rev, How do you get that figure? even if you just compare rev=made(1157 instances) and rev=stylesheet(107 instances) you get 9.25% of the examples use rev incorrectly That figure was from the case of ... (excluding rev=made, which is uninteresting since it's redundant with rel=author) since that appears to be what Hixie meant (but forgot to say) when claiming that most uses of rev were typos of rel. (Case-insensitively, I counted 1259 rev=made, 122 rev=stylesheet, and 1474 rev=... in total, which means 215 in total excluding rev=made, and 122/215=57%.) -- Philip Taylor [EMAIL PROTECTED]
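The arithmetic in the message above, spelled out in Python using only the counts quoted in the mail (the variable names are mine):

```python
# Case-insensitive counts quoted in the message.
rev_total = 1474        # all rev=... occurrences
rev_made = 1259         # rev=made
rev_stylesheet = 122    # rev=stylesheet

# Uses of rev once rev=made is excluded.
rev_excluding_made = rev_total - rev_made

# Share of the remaining uses that are rev=stylesheet.
stylesheet_share = rev_stylesheet / rev_excluding_made
```

The two percentages in the thread differ because they use different denominators: 57% is relative to the uses of rev other than rev=made, whereas the 9.25% figure divides rev=stylesheet by rev=made.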
Re: [whatwg] fixing the authentication problem
On Tue, Oct 21, 2008 at 2:16 PM, Aaron Swartz [EMAIL PROTECTED] wrote: The most common way of authenticating to web applications is:

    Client: GET /login
    Server: <html><form method="post">
    Client: POST /login user=joesmith01&password=secret
    Server: 200 OK Set-Cookie: acct=joesmith01,2008-10-21,sj89d89asd89s8d

[...] My proposal: add something to HTML5 so that the transaction looks like this:

    Client: GET /login
    Server: <html><form method="post" pubkey="/pubkey.key">...
    Client: POST /login dXNlcj1qb2VzbWl0aDAxJnBhc3N3b3JkPXNlY3JldA==
    Server: 200 OK Set-Cookie: acct=joesmith01,2008-10-21,sj89d89asd89s8d

where the base64 string is the form data encrypted with the key downloaded from /pubkey.key.

As I understand it: As an attacker, I can intercept that dXN... string. Then I can simply make a login POST request myself at any time in the future, sending the same encrypted string, and will get the valid login cookies even though I don't know the password. So it doesn't seem to work very well at keeping me out of the user's account.

Also this seems vulnerable to dictionary attacks, e.g. I can easily encrypt user=joesmith01&password=... for every word in the dictionary and will probably discover the user's password.

-- Philip Taylor [EMAIL PROTECTED]
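To illustrate the dictionary attack described in the message above, here is a toy Python sketch. It assumes the encryption is deterministic (same key and plaintext always give the same output). The function toy_deterministic_encrypt is a stand-in built from SHA-256 purely so the example is runnable; it is not real public-key encryption and not anything from the proposal, and a scheme with randomised padding would not be attackable in this simple way.

```python
import hashlib


def toy_deterministic_encrypt(public_key, plaintext):
    """Stand-in for a deterministic public-key encryption (NOT real crypto).

    Anyone holding the public key can compute it, which is all the
    attack below relies on.
    """
    return hashlib.sha256(public_key + b"\x00" + plaintext).digest()


def dictionary_attack(public_key, intercepted, username, wordlist):
    """Encrypt each candidate password and compare with the intercepted blob."""
    for word in wordlist:
        guess = f"user={username}&password={word}".encode()
        if toy_deterministic_encrypt(public_key, guess) == intercepted:
            return word
    return None
```

The attacker never decrypts anything: knowing the public key and the form layout is enough to test guesses offline, which is why the follow-up discussion turns to nonces.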
Re: [whatwg] fixing the authentication problem
On Tue, Oct 21, 2008 at 2:52 PM, Aaron Swartz [EMAIL PROTECTED] wrote: As I understand it: As an attacker, I can intercept that dXN... string. Then I can simply make a login POST request myself at any time in the future, sending the same encrypted string, and will get the valid login cookies even though I don't know the password. So it doesn't seem to work very well at keeping me out of the user's account. Also this seems vulnerable to dictionary attacks, e.g. I can easily encrypt user=joesmith01password=... for every word in the dictionary and will probably discover the user's password. I was simplifying; [...] Simplifications make it hard to tell whether it's possible to use the feature securely (and hard to tell what securely means in this context), which is a necessary condition for usefulness, so it's probably best to explain in detail exactly how you expect it'll be used, and then people can try to pick holes in it :-) . (But at least in my case, I know little enough about security that even if I can't pick holes then I'd be unwilling to assume it's secure...) in real life, I expect the server will include a nonce with the form (as a hidden input), which they'll only permit to be used once. That still doesn't help with the dictionary attacks, since the attacker knows the nonce too. I'd guess the client has to add an extra nonce (which is never transmitted in the clear) to avoid that problem. For the server-generated nonce, the login form will have to be on a page that is never cached, so that every client will get a new nonce every time they load the page. That would prevent it being used in a lot of cases where sites put a login box on every page (instead of requiring the user to go through an extra login page), which is a minor disadvantage of this scheme. How will the server limit each nonce to being used once? If it stores a list of every nonce that was ever used, it's going to be a pretty large table and slow to check on any reasonably popular site. 
If it encodes a timestamp in the nonce, it won't work if a user opens the login page (causing the new nonce to be generated) in a background tab and leaves it for a few days before trying to log in, which breaks the usually-valid assumption that you can wait indefinitely between separate HTTP requests. (Digest authentication avoids that problem because it's defined at the HTTP level and can say that the browser ought to respond immediately and to retry silently if the nonce was stale.)

Probably more importantly, does this solve any of the security flaws you indicated Digest authentication has? (i.e. how would it be better than inventing a mechanism to allow custom styling of the browser's username/password dialog box?)

-- Philip Taylor [EMAIL PROTECTED]
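A rough Python sketch of the timestamp-in-the-nonce idea discussed in the message above, to make the stale-tab problem concrete. The nonce format, function names, and the use of HMAC-SHA256 are all assumptions for illustration; note that this only bounds the nonce's age and does not by itself enforce single use.

```python
import hashlib
import hmac


def make_nonce(secret, issued_at):
    """Build 'timestamp:mac' so the server need not store issued nonces."""
    mac = hmac.new(secret, str(issued_at).encode(), hashlib.sha256).hexdigest()
    return f"{issued_at}:{mac}"


def nonce_is_fresh(secret, nonce, now, max_age_seconds):
    """Check the MAC, then reject nonces older than max_age_seconds."""
    issued_text, _, mac = nonce.partition(":")
    if not issued_text.isdigit():
        return False
    expected = hmac.new(secret, issued_text.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        return False
    return 0 <= now - int(issued_text) <= max_age_seconds
```

A login page left open in a background tab for a few days fails the freshness check, which is the usability cost pointed out in the mail.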
Re: [whatwg] canvas shadow compositing oddities
On Sun, Jul 27, 2008 at 8:06 PM, Eric Butler [EMAIL PROTECTED] wrote: [...] However, following the spec's drawing model, there are a few operators that behave rather unexpectedly if the shadow color is left at its default value. For instance, since A in B always results in transparency if either A or B is fully transparent, source-in will always simply clear the clipping region to fully transparent no matter what the source and destination are.

Oops - that does seem quite broken. (It's probably my fault - I didn't notice that problem when I was looking at how shadows should work...)

It would seem Safari isn't quite following the spec here, since it appears to never draw shadows when the shadow color is fully transparent or something and doesn't encounter these issues.

As far as I can tell: It never draws shadows when shadowColor.alpha < 1/256, regardless of the other attributes. Also, it never draws shadows when blur=0 and abs(offsetX) <= 1 and abs(offsetY) <= 1, regardless of the colour.

In the cases where it does draw shadows, there's also an issue that its compositor ignores the area outside the shape that's being drawn (instead of treating it as transparent-black, as is required by the spec and implemented by Opera and (usually) Firefox) - so in cases like http://philip.html5.org/demos/canvas/shadow-composite.html with the source-in mode, WebKit fails to clear the area outside the shape/shadow to transparent-black. (I'm testing with Safari 3.0.4 - I hope not much has changed since then). That is probably a sufficiently unusual situation that it's sensible for the spec to stay as it is and require WebKit to change, though the spec still needs to change for the default shadows-disabled case.

-- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] [canvas] imageRenderingQuality property
On 01/07/2008, Ian Hickson [EMAIL PROTECTED] wrote: [...] It seems better for the browser to simply detect when the graphics burden being placed on the hardware by the page is too much to be done at high quality given the current load on the CPU, and for the browser to automatically drop down to a lower fidelity, higher speed rendering on the fly when appropriate. Sometimes the author will want to force best-quality rendering, regardless of the performance impact. E.g. a photo manipulation application might let you resize a segment of a photo, displaying a live preview (where performance is more important than quality), and then render the final resized image and store it in a canvas for future processing. That final rendering needs to be the best possible quality, so it's not acceptable for the browser to decide that it should semi-randomly drop the quality because it detected the live preview was CPU-intensive. Similarly, a pseudo-3d FPS game might load textures at runtime and perform some preprocessing (like resizing to be square, and rendering lots of smaller copies to be used as mipmaps for distant walls so they look prettier), and then draw that processed texture into the game thousands of times a second. Since the preprocessing is only done once, and its result is reused for the whole of the rest of the game, it should be done at the highest possible quality, regardless of performance. So, adaptively reducing the quality and allowing no author control seems like a bad idea. Perhaps the imageRenderingQuality property could have values 'high' and 'auto', where the default is 'high' (so that existing content continues working the same as it always has, and to avoid surprising authors by randomly switching the rendering quality when they have no reason to expect such weird behaviour), and 'auto' means 'low (but perhaps switch to high if the browser thinks it's going to be fast enough)'. 
That would avoid the issue of authors setting quality='low' and preventing high-speed users from getting the best quality output. -- Philip Taylor [EMAIL PROTECTED]
[whatwg] Canvas tests updated
I've recently updated my canvas tests at http://philip.html5.org/tests/canvas/suite/tests/ so they ought to be up-to-date with the latest version of the spec, and have greater coverage than before. (Text rendering is the only thing that's intentionally untested, though I may have missed some other smaller areas.) http://philip.html5.org/tests/canvas/suite/tests/results.html shows the results I got from testing various browsers. Using the totally unfair unrepresentative biased method of counting the number of passes, the latest versions of Opera and WebKit and Konqueror are roughly tied in first place while Firefox is last (except for IE which doesn't really count), but even the best fail about a quarter of the tests, so there's plenty of scope for bug-fixing :-) Anyway, it's quite likely that a number of the tests are incorrect, since nobody (including me) has reviewed them carefully; or that the tests are correct according to the spec but the spec is incorrect according to reality; so it would be good to get feedback from anyone who notices issues in them. -- Philip Taylor [EMAIL PROTECTED]
[whatwg] commit-watchers mail format
The mails sent to [EMAIL PROTECTED] are not very user-friendly. In particular, I collect them in Gmail and 'star' interesting ones that I want to look at in more detail in the future. When looking at the list of starred emails, I see: whatwg WHATWG [html5] r1771 - / - Author: ianh Date: 2008-06-13 01:59:40 -0700 (Fri, 13 Jun 2008) New Revision: 1771 Modified … 13 Jun whatwg WHATWG [html5] r1770 - / - Author: ianh Date: 2008-06-13 01:49:25 -0700 (Fri, 13 Jun 2008) New Revision: 1770 Modified … 13 Jun whatwg WHATWG [html5] r1768 - / - Author: ianh Date: 2008-06-13 01:22:57 -0700 (Fri, 13 Jun 2008) New Revision: 1768 Modified … 13 Jun whatwg WHATWG [html5] r1767 - / - Author: ianh Date: 2008-06-13 01:12:01 -0700 (Fri, 13 Jun 2008) New Revision: 1767 Modified … 13 Jun which makes it impossible to work out what a given email is about, or to find the email that's about a given change. So it could be nice if the commit message was in the subject line, or at the top of the body (so it would appear in Gmail's content snippet thing). -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Bad CSS on the multipage version
On 04/06/2008, Křištof Želechovski [EMAIL PROTECTED] wrote: Regarding your page at the URL http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level.html #the-embed: [...] Element headings (level 4) are invisible (obscured underneath the following content). Seems to be an IE CSS bug like in http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0D%0A%3Cstyle%3E%0D%0A%20.a%20%7B%20border%3A2px%20blue%20solid%20%7D%0D%0A%20.b%20%7B%20border%3A2px%20green%20solid%3B%20background%3Ayellow%3B%20margin-top%3A-0.8em%20%7D%0D%0A%3C%2Fstyle%3E%0D%0A%3Cdiv%20class%3Da%3EThis%20text%20should%20be%20visible%20on%20top%20of%20the%20yellow%0D%0A%20%3Cdiv%20class%3Db%3E...%3C%2Fdiv%3E%0D%0A%3C%2Fdiv%3E That case fails in IE7; it works in IE8 (and in recent versions of Firefox, Opera, Safari, Konqueror). I don't know if there's a 'proper' way to fix this, but adding h4 { position: relative; } into the page's CSS makes it work correctly in IE7, and doesn't affect any other browser. -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] More ImageData issues
On 22/02/2008, Oliver Hunt [EMAIL PROTECTED] wrote: At the moment the spec merely states that putImageData(getImageData(sx,sy,..),sx,sy) should not result in any visible change to the canvas, however for those implementations that use a premultiplied buffer there is a necessary premultiplication stage during blitting that results in a loss of precision in some circumstances -- the most obvious being the case of alpha == 0, but many other cases exist, eg. (254, 254, 254, alpha < 255). This loss of precision has no actual effect on the visible output, but does mean that in the following case: imageData = context.getImageData(0,0,...); imageData.data[0]=254; imageData.data[1]=254; imageData.data[2]=254; imageData.data[3]=1; context.putImageData(imageData,0,0); imageData2 = context.getImageData(0,0,...); At this point implementations that use premultiplied buffers can't guarantee imageData.data[0] == imageData2.data[0] Currently no UA can guarantee a roundtrip so i would suggest the spec be updated to state that implementations do not have to guarantee a roundtrip for any pixel where alpha < 255. The spec does not state that getImageData(putImageData(data)) == data, which is where the problem would occur. It only states that putImageData(getImageData) == identity function, which is not a problem for premultiplied implementations (since the conversion from premultiplied to non-premultiplied is lossless and reversible). So I don't think the spec needs to change at all (except that it could have a note mentioning the issue). (getImageData can convert internal premultiplied (pr,pg,pb,a) into ImageData's (r,g,b,a): if (a == 0) { r = g = b = 0; } else { r = (pr * 255) / a; g = (pg * 255) / a; b = (pb * 255) / a; } (using round-to-zero integer division). putImageData can convert the other way: pr = (r*a + 254) / 255; pg = (g*a + 254) / 255; pb = (b*a + 254) / 255; Then put(get()) has no effect on the values in the premultiplied buffer.) -- Philip Taylor [EMAIL PROTECTED]
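The two conversions described above can be checked exhaustively. The following is only a sketch of the arithmetic in the message, not any browser's actual implementation; the function names (idiv, unpremultiply, premultiply, putGetIsIdentity) are made up for illustration, and it assumes a valid premultiplied buffer, i.e. each colour channel is no larger than its alpha.

```javascript
// Integer division that rounds toward zero, as the message specifies.
function idiv(n, d) {
  return Math.trunc(n / d);
}

// getImageData direction: premultiplied (pr, pg, pb, a) -> ImageData (r, g, b, a).
function unpremultiply([pr, pg, pb, a]) {
  if (a === 0) return [0, 0, 0, 0];
  return [idiv(pr * 255, a), idiv(pg * 255, a), idiv(pb * 255, a), a];
}

// putImageData direction: ImageData (r, g, b, a) -> premultiplied (pr, pg, pb, a).
function premultiply([r, g, b, a]) {
  return [idiv(r * a + 254, 255), idiv(g * a + 254, 255), idiv(b * a + 254, 255), a];
}

// Check that put(get(x)) leaves every valid premultiplied value unchanged.
// Assumption: a valid premultiplied channel p satisfies 0 <= p <= a.
function putGetIsIdentity() {
  for (let a = 0; a <= 255; a++) {
    for (let p = 0; p <= a; p++) {
      const [q] = premultiply(unpremultiply([p, p, p, a]));
      if (q !== p) return false;
    }
  }
  return true;
}
```

By contrast, going the other way round is lossy: feeding Oliver's pixel (254, 254, 254, alpha 1) through premultiply and then unpremultiply does not give 254 back, which is the case the spec does not require to round-trip.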
Re: [whatwg] Feedback on dfn, abbr, and other elements related to cross-references
On 21/04/2008, Smylers [EMAIL PROTECTED] wrote: Can you link to examples of such webpages, which have abbr elements without title attributes? What does that mark-up currently achieve? Out of 130K pages from dmoz.org, I see 592 using abbr elements, and 36 of those using it at least once with no title attribute. If anyone cares enough, they could look through the list to see how many are bogus and how many are expecting something useful and what they seem to be expecting. Those 36 pages which used abbr with no title a couple of months ago: http://bundesrecht.juris.de/gsgv_9 http://linuxdidattica.org/ http://markcronan.livejournal.com/33814.html http://observer.guardian.co.uk/politics/story/0,6903,449920,00.html http://outer-court.com/goodies/index.htm http://spazioinwind.libero.it/saf/ http://tubewhore.livejournal.com/ http://www.artofeurope.com/wong/ http://www.beepworld.de/members10/princessa18/ http://www.cs.tut.fi/~jkorpela/latinaohje.html http://www.danscamera.com/ http://www.fwbosheffield.org/ http://www.gnu.org/ http://www.jokan.de/technik-c2.html http://www.mozilla.org/directory/ http://www.mozilla.org/projects/mathml/ http://www.offaly.ie/offalyhome/visitoffaly/Attractions/Family/bog+train.htm http://www.rekordbog.dk/ http://www.seobythesea.com/ http://www.travelphp.com/ http://www.treseta.fi/ http://www.voyager.prima.de/cpp/books1.html http://www.w3.org/TR/XMLHttpRequest/ (plus 5 more on guardian.co.uk, and 8 more on beepworld.de) -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] ALT and equivalent representation
On 18/04/2008, Bill Mason [EMAIL PROTECTED] wrote: The example was a case of a hacker who replaces the Google logo on google.com with an image only containing the text WE HACKED YOUR SERVERS. We assume the hacker cares enough about accessibility to set the alt attribute to the same text. More generally (and less hypothetically), this is any case where an image is being used just to display text (in a nicer font, or nicer colours, or animated and on fire, or some other reason it's worth using an image instead of plain HTML). Since the image is no longer the company logo, it falls outside the logo discussion in the Icons requirement for alt. I believe the company logo case is also unclear in the spec. See e.g. http://www.google.com/ (when it's not a special day) - the image is simply the word Google (as a page heading, so it should probably be in h1), so common sense says it should have alt=Google. The spec phrase Icons: a short phrase or label with an alternative graphical representation sounds like it might apply here, but none of the cases in that section seems to work: in particular, I don't think the logo is being used to represent the entity would apply, because the purpose of the image is not to represent the entity (as it would be in e.g. a list of search engines that shows small images of all their logos so you can choose your favourite), and instead its purpose is to tell users what site they are on (and to make it look prettier). It should be made clearer whether the existing case does or does not apply. If it does not apply, it should be made clear what alt text to use instead. Since we're on this topic... What should happen for 'tracker' images? (i.e. img src=http://evil.google.com/user-track.php?site=97519340; width=1 height=1 alt=???) 
As some examples, Geocities has alt=setstats, someone has alt=statystyka, someone has alt=CrawlTrack: free crawlers and spiders tracking script for webmaster- SEO script -script gratuit de détection des robots pour webmaster, etc, and those examples do not help users who are seeing the alt text. Such images are pretty common, and they're not going to go away, so we should minimise their harm by saying alt= is appropriate. None of the cases in the spec seem to cover this case yet. http://validator.nu/?doc=http%3A%2F%2Fwww.google.com%2Fshowimagereport=yesshowsource=yes shows that some versions of Google (depending on cookies, IP address, etc) implement the Google logo as four separate images, approximately like: .--.-.. | G o o|g|l e | '--+-+' '-' Suomi where the Suomi (text, not image) is adjacent to the g's descender. The Goo image has alt=Google, and the other three images have alt=. When the page is viewed without images, that means it will say Google instead of the logo, which is a good thing. But HTML5 says that the alt text is equivalent to the image, which is not true (and could only be satisfied by alt=Goo, alt=le, alt=Most of a g, alt=A little bit of a g, which would be silly) - in this case, it is the combination of alt texts on the whole page that is equivalent to the combination of images on the page. google.com is splitting the image up to fit it in a layout table, which is non-conforming HTML5; but there are other more legitimate reasons for having several img elements representing a single piece of text, and in those cases it seems sensible to put alt=all the text on one image and alt= on the others. Should HTML5 be changed to accept this? 
And as a more general point, the spec provides a list of cases for using img (and how to use alt for those cases), but this list will never be complete (especially since the case matches are all subjective and open to interpretation in multiple ways), so there needs to be a default case statement for images where the author doesn't think any of the specific requirements applies. -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Question about the PICS label in HTML5
On 16/04/2008, David Gerard [EMAIL PROTECTED] wrote: I may have missed it, but does anyone, anywhere, actually use PICS? I don't think I've even heard the name uttered in a few years - I assumed it had died of neglect and lack of interest. About 1% of the pages listed on dmoz.org attempt to use it - see http://philip.html5.org/data/pics-label.html (I have no idea how many of those uses are syntactically valid (maybe someone could test that if they're quite bored), or are appropriate for the page's content.) -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] A comment to character encoding declaration
On 03/03/2008, Jjgod Jiang [EMAIL PROTECTED] wrote: During the development of CJK information processing, many text encodings ended up as strict subsets of others; for example, GB2312 is a subset of GBK, and GBK is a subset of GB18030. For compatibility purposes, a lot of web pages use a character encoding declaration like this: <meta http-equiv="Content-Type" content="text/html; charset=gb2312"> in their header, yet they might use characters that are in GBK but not in GB2312. So, I think we can suggest that clients simply treat encodings like these as their biggest superset, for instance, treat GB2312 as GB18030. Out of 130K pages from dmoz.org, I see 760 which are declared as gb2312 (by HTTP Content-Type, meta content, etc). Of those 760, 120 cause decoding errors in ICU4J when treated as gb2312. 8 cause errors when treated as gbk, and the same 8 cause errors as gb18030. Those 8 are: http://www.bigm.com.cn/dinosaur/anecdote/ http://www.ccpc.edu.cn http://www.gdoverseaschn.com.cn/ http://www.jgbr.com.cn http://www.liechebuluo.com http://www.netbro.com.cn http://www.tkdts.com http://www.wuxi-accp.com/ and I haven't tried working out why they are causing errors. The 120 are listed at http://philip.html5.org/data/gb2312-errors.txt. I don't know how many are really using gb18030, and how many are not actually gb* but happen to be decoded without errors because they use compatible byte sequences; but it does look like gb2312 is a fairly significant problem if it's not treated as gbk/gb18030, so it would be helpful to suggest/require it to be processed specially. -- Philip Taylor [EMAIL PROTECTED]
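The "treat the declared encoding as its biggest superset" idea could look roughly like this. It is only a sketch of the proposal in this thread, not what any spec or browser necessarily does; the function name effectiveEncodingLabel and the choice to lower-case and trim the label are assumptions made for the example.

```javascript
// Hypothetical helper: map a declared charset label to the label that
// should actually be used for decoding, per the proposal in this thread.
const GB_SUPERSET = "gb18030";

function effectiveEncodingLabel(declared) {
  // Assumption: labels are compared case-insensitively, ignoring
  // surrounding whitespace.
  const label = declared.trim().toLowerCase();
  // GB2312 is a subset of GBK, and GBK is a subset of GB18030, so decode
  // both with the largest superset.
  if (label === "gb2312" || label === "gbk") return GB_SUPERSET;
  return label;
}
```

A decoder would then be chosen from the returned label rather than from the declared one, so a page declared as gb2312 but containing GBK-only characters would no longer produce decoding errors for that reason.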
Re: [whatwg] Proposal for a link attribute to replace a href
On 28/02/2008, Shannon [EMAIL PROTECTED] wrote: http://wiki.whatwg.org/wiki/FAQ#Does_HTML5_support_href_on_any_element_like_XHTML_2.0.3F So 'backwards-compatibility', as defined by the same document, can be achieved by using javascript to walk the DOM and add 'window.location(node.getAttribute('link'))' to the onclick handler of any nodes with a link attribute. I have done a very similar thing before to implement :hover on non-anchor elements in IE. I imagine the script would have problems with incremental loading - if someone clicks a link before the page has finished loading and before the script has executed, then it won't work. Is there a way to avoid that problem and make it work as well as a real implementation? There are also tools like search engines that need to recognise links and can't be fixed by compatibility scripts. (Fortunately it's much easier to upgrade all the world's search engines than all its web browsers, so this is a less significant issue than with browsers.) A global attribute offers several features that a does not - most importantly nested links and the ability to hyperlink block and interactive elements without breaking validation. Are there cases where div ...a href=... style=display:block; width:100%; height:100% ... /a/div is not adequate for making block links? FAQ: * It doesn't make sense for all elements, such as interactive elements like input and button, where the use of href would interfere with their normal function. As long as the spec is clear about which actions take precedence then this is not an issue. Having to make the spec clear is an issue :-) It would take quite a bit of effort to design and specify the feature in sufficient detail, and to write test cases, and to update the spec in response to implementor feedback about what would cause them fewer problems. 
That is all much harder when the new feature interacts with a lot of existing features (inputs, buttons, image maps, iframes, DOM events, etc), compared to something fairly self-contained (like video). How is a global link/href any more difficult than the existing implementations of onmouseup/down/whatever? It's basically the same thing - only *simpler* (no scripting, events, bubbling, etc). As far as I'm aware, HTML elements currently have at most one default click-event handler and any number of DOM handlers. The new link attribute wouldn't be a DOM event handler (since it ought to behave much more like a href), so it would either be an entirely new type of event handler or it would break the assumption that there is a single default handler, and I can imagine that that would cause difficulties. (But I have no implementation experience so I could be entirely wrong.) There are cases like button type=submit link=... onclick=event.preventDefault() button type=submit link=javascript:event.preventDefault() a href=1 link=2 onclick=window.location=3 a href=1 link=2 onclick=window.location=3; return false etc where the interaction between several aspects of the event model would have to be defined (and implemented and tested), requiring some new complexity on top of the current href+onclick model. Another issue is that a href has a number of other attributes for links, and it would be bad design if a generalisation of href didn't allow those attributes on other elements. That includes 'target' (conflicts with base target, form target), 'type' (conflicts with style type, script type, embed type, object type), 'media' (conflicts with style media, link media), etc. Is there a nice way to solve those conflicts? Renaming the link attributes (the same as renaming 'href' to 'link') would be confusing to people who already know HTML, and it would be hard to find good names that aren't used already. 
Defining lots of exceptional cases for certain attributes on certain elements would make the language harder to understand and implement and test. Defining exceptions for a category of 'non-visible elements' (script, style, etc) wouldn't work since script style=display:block is not non-visible. I'm not sure how this could be made to work well. Shannon -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] html start tag token in the root element phase
On 29/06/2007, Henri Sivonen [EMAIL PROTECTED] wrote: If the spec dealt with the html start tag token directly in the root element phase, the parse error in the main phase wouldn't need to be conditional. (Implementations that experience a perf benefit from not mutating the attributes of a node probably want to hoist the html node creation to the root element phase for perf reasons, too.) There's also an issue with: <!doctype html>foo<html> not producing any parse error, because the <html> is the first start tag token (at least under my interpretation) and therefore is considered valid. Handling html specially in the root element phase seems like a reasonable way of fixing this. -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Canvas line styles comments
Some comments on the newly modified version: The lineCap attribute defines the type of endings that UAs shall place on the end of lines. - it seems weird to use shall, since this is the only place in the spec (except the list of RFC2119 keywords) that uses it. The other line* properties don't try to define conformance requirements like that (e.g. they say The lineWidth attribute gives the width of lines which is only informative), so I can't tell whether the lineCap one is trying to be a requirement. The lineJoin attribute defines the type of corners that that UAs will place where two lines meet. - s/that that/that/ A join exists at any point in a subpath shared by two consecutive pairs of lines. - should be two consecutive lines or a consecutive pair of lines. In addition to the point where the join occurs, two additional points are relevant to each join: the corners found half the line width away from the join point, perpendicular to the two lines joining at the join point. - I'm not sure what that means. Nothing can be perpendicular to both of the two lines (unless they're parallel). For each line, there are the two corners half the line width away from the join point perpendicular to that line, but that gives four corners in total. I suppose it'd be alright to say there's four corners, and then talk about the two corners on the outside of the join since the meaning of outside is obvious enough even if it's not defined (at least when the lines aren't parallel). A filled triangle connecting ... with the third point of the triangle being the point of the join itself (where the lines touch on the inside of the join), must be rendered at all joins. - the inside of the join bit seems unhelpful and unclear (since it's not the opposite of the outside of the join) - it'd be better just to say ... being the join point, must be ..., since that's the term used earlier for that point. 
The round value means that a filled arc connecting the two corners on the outside of the join, with the diameter equal to the line width and the origin at the point of the join, must be rendered at joins. - if I was being pedantic (which I am) I'd say there's two possible arcs connecting those two corners (one clockwise, one anticlockwise), so it should specify which one is meant. But I don't know how to easily say that, and an implementor would have to be silly to do it the wrong way, so maybe a precise definition isn't needed. Should lineJoin='round';moveTo(0,0);lineTo(100,0);lineTo(0,0);stroke() draw a semicircle at (100,0) pointing rightwards? There is no outside of the join there, so the spec doesn't say what should happen. The miter value means that a filled four-sided polygon must be rendered at the join, with two of the lines being the perpendicular edges of the joining lines, ... - the miter-polygon lines aren't the perpendicular edges - they're only half of each edge (between the join point and the outside corners). It's probably easier to define the polygon's points (being the join point, the two outside corners, and the point where the two continuated outside edges intersect). -- Philip Taylor [EMAIL PROTECTED]
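The suggested definition of the miter polygon by its points (the join point, the two outside corners, and the point where the continued outside edges intersect) can be sketched as below. This is only an illustration of the geometry; the function name miterPolygon and the {x, y} point objects are invented for the example, the miter limit is ignored, and the degenerate cases (zero-length or parallel lines, such as the lineTo(100,0);lineTo(0,0) case above) simply return null because there is no outside of the join to work with.

```javascript
// Sketch: compute the four points of the miter polygon for the join at p1
// between the lines p0-p1 and p1-p2. Returns null when there is no
// well-defined outside of the join.
function miterPolygon(p0, p1, p2, lineWidth) {
  const unit = (from, to) => {
    const dx = to.x - from.x;
    const dy = to.y - from.y;
    const len = Math.hypot(dx, dy);
    return { x: dx / len, y: dy / len };
  };
  const d1 = unit(p0, p1); // direction of the incoming line
  const d2 = unit(p1, p2); // direction of the outgoing line

  // The sign of the cross product says which way the path turns at p1.
  const cross = d1.x * d2.y - d1.y * d2.x;
  if (!Number.isFinite(cross) || cross === 0) return null;

  // Normals obtained by rotating each direction by 90 degrees; the outside
  // of the join lies on the side opposite to the turn.
  const n1 = { x: -d1.y, y: d1.x };
  const n2 = { x: -d2.y, y: d2.x };
  const h = (cross > 0 ? -1 : 1) * (lineWidth / 2);

  // The two outside corners, half the line width from the join point.
  const corner1 = { x: p1.x + h * n1.x, y: p1.y + h * n1.y };
  const corner2 = { x: p1.x + h * n2.x, y: p1.y + h * n2.y };

  // The tip lies along n1 + n2, at the distance where both continued
  // outside edges meet.
  const k = h / (1 + n1.x * n2.x + n1.y * n2.y);
  const tip = { x: p1.x + k * (n1.x + n2.x), y: p1.y + k * (n1.y + n2.y) };

  return [p1, corner1, tip, corner2];
}
```

For a right-angle turn such as (0,0) to (10,0) to (10,10) with lineWidth 2, the two outside corners are (10,-1) and (11,0) and the tip is (11,-1).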
Re: [whatwg] Canvas patterns, and miscellaneous other things
On 31/01/2008, Ian Hickson [EMAIL PROTECTED] wrote: I've made toDataURL() return data:, if it's faced with a 0-pixel image. It's arbitrary, but I guess it represents the image, at least! That makes the Note: When trying to use types other than image/png, authors can check if the image was really returned in the requested format by checking to see if the returned string starts with one the exact strings data:image/png, or data:image/png;. now incorrect. The non-image/png format might be unsupported, but someone might be drawing a 0-pixel image and they'll get back something that doesn't start with data:image/png[,;]. It does seem pretty weird to return text/plain content when asked for an image. But I guess it's safer than e.g. returning an empty string, since it won't get misinterpreted as a relative address when people do img.src=canvas.toDataURL(), and I can't think of a better idea. User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations. (See ...#hardwareLimitations.) Does anything say that those limitations should be imposed by throwing an exception, and not by e.g. returning null or aborting the entire script? I'm assuming that the DOM Bindings for JS spec will define how 'undefined' really means 'null' Hmm, I can imagine 'undefined' converted to a DOMString becoming the string undefined. (That's at least what document.createTextNode(undefined) does). But I can just assume for now it's meant to work like null. -- Philip Taylor [EMAIL PROTECTED]
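To make the problem with that note concrete, here is a sketch of the check an author would need once a 0-pixel canvas can return data:, as well. The function names (dataUrlHasType, classifyDataUrl) and the result strings are invented for this example; the only behaviour assumed from the spec text quoted above is the data:, return value and the fallback to image/png.

```javascript
// True only if the data: URL starts with exactly "data:<type>," or "data:<type>;".
function dataUrlHasType(url, mimeType) {
  return url.startsWith("data:" + mimeType + ",") ||
         url.startsWith("data:" + mimeType + ";");
}

// Hypothetical classification of a toDataURL() result for a requested type.
function classifyDataUrl(url, requestedType) {
  if (url === "data:,") return "empty";               // 0-pixel image
  if (dataUrlHasType(url, requestedType)) return "requested";
  if (dataUrlHasType(url, "image/png")) return "png-fallback";
  return "unknown";
}
```

An author who only checks for the data:image/png prefix would treat the "empty" case as if the requested format had been returned, which is why the note as written is no longer accurate.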
Re: [whatwg] Canvas arcTo
On 31/01/2008, Ian Hickson [EMAIL PROTECTED] wrote: On Mon, 2 Jul 2007, Philip Taylor wrote: If the point (x2, y2) is on the line defined by the points (x0, y0) and (x1, y1) then the method must do nothing, as no arc would satisfy the above constraints. - why would no arc satisfy the constraints? If P0, P1, P2 are collinear and non-coincident, then (I think) any of the (infinitely many) circles which have the given radius and touch tangential to the line P0-P2 will satisfy the constraints (i.e. being tangential to P0-P1 at some point and to P1-P2 at some point). The idea is to just take the two (infinite) lines that are defined by the points (end at P1, cross P0 and P2), and draw a circle with the given radius between them. When the lines are the same line (i.e. P0-P1 is parallel to P1-P2) then no circle with a finite non-zero radius can touch the line tangentially at more than two points, since for each half of the circle, every point has a different tangent, and the two points on opposite sides of the circle are tangents to parallel but distinct lines unless the radius is zero. No? The circle can't touch tangentially at two distinct points, but nothing said there had to be two distinct points. There just had to be one point on the circle tangential to one line, and one point tangential to the other line, so they could easily be equal points. About the updated specification: the method must add a point (xinf;, yinf;) - s/inf;/infin;/ the infinite line that crosses the point (x0, y0) and ends at the point (x1, y1) - it could be clearer to say half-infinite line. (It seems the technical term is ray or half-line, but those aren't as clear.) -- Philip Taylor [EMAIL PROTECTED]
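For reference, the collinear case being argued about is easy to detect. This is a minimal sketch, assuming exact arithmetic is good enough for the caller; the name isCollinear and the optional epsilon tolerance are made up for the example and are not part of the spec text.

```javascript
// Sketch: (x2, y2) is on the line through (x0, y0) and (x1, y1) when the
// cross product of P1 - P0 and P2 - P0 is zero. `epsilon` is a hypothetical
// tolerance for callers working with inexact floating-point coordinates.
function isCollinear(x0, y0, x1, y1, x2, y2, epsilon = 0) {
  const cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0);
  return Math.abs(cross) <= epsilon;
}
```

Whether arcTo should then do nothing, or pick one of the circles touching the shared line, is exactly the open question in this thread; the check only identifies when the question arises.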
Re: [whatwg] Canvas line styles comments
On 02/02/2008, Kristof Zelechovski [EMAIL PROTECTED] wrote: The rounding arc should be chosen so that it is not contained in the convex hull of the stroke path segments terminated at the points where the arc begins. I believe I can see the idea there, but I can't quite tell what that phrase means about terminating. The contained within also seems inaccurate, because e.g. lineWidth=100;moveTo(0,0);lineTo(1,0);lineTo(1,1) would result in a convex hull that doesn't contain either arc, though I think it'd be alright if said does not intersect instead. A possible alternative that seems simpler and (I think) correct (except in the special parallel case): The rounding arc should be chosen so that if it was closed, it would not contain the join point. -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Canvas line styles comments
On 02/02/2008, Kristof Zelechovski [EMAIL PROTECTED] wrote: You considered the convex hull of the original lines to get that paradox; I had the stroke path segments in mind. (Stroke path segments are the path equivalent of the stroked curve when the stroke operator is not allowed and must be replaced by the fill operator). Each line corresponds to two parallel stroke path segments; two of them intersect and the other two get joint with an arc. One of the possible arcs is in the convex hull of those stroke path segments. If the two lines are very short, their stroke paths will (if I understand correctly) look like .-. | | | | | | .-|-*---. '-|-|---' | | | | '-' where the * is the join point and the short lines are the two parallel stroke path segments of each line. Then the convex hull is nearly a square rotated by 45 degrees, like .-. /| |'- / | | '- /| |'-. .-|-*---. '-|-|---' '. | |.-' '-.| |_.-' '-' and so an arc with radius lineWidth/2 from the rightmost point going clockwise to the upmost point will not be contained entirely within that nearly-square. So neither arc is within the convex hull. -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] More random comments on the putImageData definition
On 23/01/2008, Oliver Hunt [EMAIL PROTECTED] wrote: It would be great if putImageData could take a source region, in addition to the destination. One of the primary reasons for using get/putImageData is to allow JS to rapidly blit data to the screen, however without an ability to blit only a subregion of the image data the only available options are to either re-blit the entire imagedata region (which can be expensive due to the need for [un]premultiplying in some (all?) implementations), ((Opera does non-premultiplied colour internally.)) or create and populate a new ImageData object which still requires more work than would ideally be necessary. You can also create a temporary canvas and putImageData once onto that, and then drawImage sections onto the screen as they are needed. That lets you draw lots of sections lots of times quickly (since you're mostly drawing from the optimised canvas surface format, not from a JS array), which perhaps helps in some (most?) of the cases. (You still have to do a single putImageData of the whole data to get it onto the temporary canvas, but if there are parts of the data you aren't ever using then you just should make the ImageData smaller and cut out the unused bits.) -- Philip Taylor [EMAIL PROTECTED]
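The temporary-canvas approach might be sketched like this. The names makeRegionBlitter and createCanvas are hypothetical; in a page, createCanvas would typically be something like () => document.createElement("canvas"), passed in here only so the sketch does not assume a global document.

```javascript
// Sketch: put the whole ImageData onto a scratch canvas once, then copy
// sub-rectangles from that canvas with drawImage as often as needed.
function makeRegionBlitter(imageData, createCanvas) {
  const scratch = createCanvas();
  scratch.width = imageData.width;
  scratch.height = imageData.height;
  // The one (potentially expensive) putImageData of the whole data.
  scratch.getContext("2d").putImageData(imageData, 0, 0);

  // Draw the source rectangle (sx, sy, sw, sh) at (dx, dy), unscaled.
  return function blit(targetCtx, sx, sy, sw, sh, dx, dy) {
    targetCtx.drawImage(scratch, sx, sy, sw, sh, dx, dy, sw, sh);
  };
}
```

Note that drawImage composites normally rather than replacing pixels the way putImageData does, so this is only equivalent when that difference doesn't matter (e.g. opaque data, or a suitable globalCompositeOperation).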
[whatwg] Canvas bits
The ImageData object's width is greater than zero. (and subsequent lines) is wrong, since it's talking about an object that's explicitly not an ImageData. What happens with NaN in imagedata.data? (NaN is a Number, so it's allowed in the data array. It's not below 0, or above 255, and it can't be rounded to the nearest integer.) Note: The transformation is applied to the path when it is drawn - oh no it isn't. -- Philip Taylor [EMAIL PROTECTED]
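The NaN question can be demonstrated directly in script. This sketch only shows why none of the three rules mentioned above (below 0, above 255, round to nearest integer) catches NaN; the function name is made up, and it says nothing about what an implementation should do instead.

```javascript
// Sketch: evaluate a value against the rules the spec text relies on.
function describeAgainstByteRules(value) {
  return {
    isNumber: typeof value === "number",          // NaN is a Number
    belowZero: value < 0,                         // comparisons with NaN are false
    above255: value > 255,
    roundsToInteger: Number.isInteger(Math.round(value)), // Math.round(NaN) is NaN
  };
}
```

For NaN this yields a Number that is neither below 0 nor above 255 and does not round to any integer, so the spec needs to say explicitly what is stored.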
Re: [whatwg] Canvas ImageData comments
On 18/01/2008, Ian Hickson [EMAIL PROTECTED] wrote: On Sat, 16 Jun 2007, Philip Taylor wrote: Colour spaces are not dealt with at all, but are particularly relevant for getImageData (else you have no idea what the values mean). Fixed, in theory. But since I have no idea what I'm talking about here, you'll have to check closely to make sure I didn't babble incoherently. I don't know much about colour spaces either, so someone else should check that it's sane :-) maybe it's safest to just say that all colours throughout the canvas API must be handled consistently in the same colour space (without saying exactly which it is). Wouldn't that mean that different browsers could have different effects when rendering external images -- with gamma -- to the canvas? I guess so; and things like gradients wouldn't work consistently if the colour space wasn't consistent. Maybe the desired properties are: - drawImage(img) onto a displayed canvas should look the same as the original img, regardless of whether the image has gamma etc. - toDataURL should return the same raw pixel data as getImageData, at least for image/png (though other formats might make that impossible), for consistency. - drawImage(toDataURL()) should have no effect. I'd also like: - fillStyle = 'rgb(r, g, b)'; fillRect(...); getImageData returns exactly [r, g, b, 255]. mainly because that makes it possible to write test cases that use getImageData to check the results. I don't know if any of these are wrong, or if others are missing. And I have no idea if this is trivial for implementors, or if it's impossible. So I don't have any useful suggestions. The putImageData(image, dx, dy) method must take the given ImageData structure, and draw it at the specified location dx,dy in the canvas coordinate space, mapping each pixel represented by the ImageData structure into one device pixel. - how should it 'draw it'? 
Given the requirement on putImageData(getImageData(...)), it has to be replacing the pixels in that area rather than doing anything like normal drawing, but that isn't explicit. Is it better now? It looks clear enough to me. In the example code: [...] function FillCload(data, x, y) { ... } - should be function FillCloud(data, x, y) { ... }. That error was replaced with function AddCload(data, x, y) { ... } - s/a/u/ The width and height (w and h) might be different than the sw and sh arguments to the function - 'different than' sounds a bit odd to me here; maybe I'd prefer 'different from'. Oops, I was wrong to mention that - 'different than' seems to be common in some Englishes, and I don't want to complain when it's just dialect variations. -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Minor addition/rewording for canvas section
On 13/01/2008, Oliver Hunt [EMAIL PROTECTED] wrote: Writing to a canvas from a different origin isn't considered a threat, the problem is evil.example.com reading data from the canvas after naive.example.com has put private/confidential information into the canvas. In that case, evil.example.com shouldn't be allowed to read anything (pixel data or context state) from the canvas after naive.example.com has done anything at all to it (e.g. calling fillRect, or setting fillStyle, etc), because otherwise some potentially-private information will be leaked. (putImageData can be emulated using fillRect, so it wouldn't make much sense to have different security restrictions depending on which equivalent mechanism you use.) Don't the normal same-origin restrictions already prevent naive.example.com and evil.example.com accessing the same canvas element, in the same way as (I assume) they prevent evil.example.com accessing an input type=password.value from a naive.example.com document? -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Minor addition/rewording for canvas section
On 13/01/2008, Oliver Hunt [EMAIL PROTECTED] wrote: I did wonder about why other origins could read anything myself, so you're not alone -- it just seemed especially odd to allow images to be written safely but not ImageData.
As far as I'm aware, different origins can never read and write the same canvas. Images are given special consideration because scripts already have access to Image objects where the image has a different origin to the script, like:
    // on a page on www.example.com
    var img = new Image();
    img.onload = function () { ctx.drawImage(img, 0, 0); };
    img.src = 'http://google.com/images/logo.gif';
The canvas reading/writing all happens in the same origin - it's just the image itself that is not the same origin. The same does not apply to ImageData, because scripts don't have access to ImageData objects from other origins. -- Philip Taylor [EMAIL PROTECTED]
[whatwg] must only ambiguity
Documents and document fragments / Structure says Authors must only use elements in the HTML namespace in the contexts where they are allowed, as defined for each element. That phrase is unclear. It could be interpreted as: Authors must { only use elements in the HTML namespace } in { the contexts where [elements in the HTML namespace] are allowed }, i.e. contexts expecting HTML namespaced elements mustn't contain foreign content. Authors must { [...] use elements in the HTML namespace } [only] { in the contexts where they are allowed }, i.e. HTML elements must not be used where they aren't allowed. Authors must only { use elements in the HTML namespace in the contexts where they are allowed }, i.e. pretty much every imaginable action in the entire world is disallowed, except for using elements where allowed. A suggested replacement: Authors must not use elements in the HTML namespace except where allowed by the context defined for the element. Similarly, Authors must only put elements inside an element if that element allows them to be there according to its content model should be fixed to say something like Authors must not put elements inside an element unless that element allows them to be there according to its content model. More generally, all uses of must only and may only etc seem dangerous. The spec says The key words [...] in the normative parts of this document are to be interpreted as described in RFC2119, but instead they have to be interpreted as described by the standard English grammar rules when they're used in complex phrases like must only, which makes the spec harder to read when you're trying to read the normative requirements, and can cause misunderstanding. (Does that make things particularly harder for non-native-English-speaking people?) The conformance requirements would be clearer if all occurrences of x must only y and x may only y were replaced by x must not { not y } or by x may y, and x must not { not y }. 
Similarly, x should only y if z (e.g. authors should only use these elements if the absence of those elements would change the meaning of the content) should be replaced by x should y if z, and should not do so otherwise or x should not y if not z (depending on which directions the 'should' applies in). -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] HTML5 and URI Templates
On 17/12/2007, James M Snell [EMAIL PROTECTED] wrote: It should be possible for us to also do something like:
    <form action="http://example.org/form_processor" template="http://example.org?{-join|&|a,b}" method="POST">
      <input name="a" type="text" />
      <input name="b" type="text" />
      <input name="c" type="text" />
      <input name="d" type="text" />
    </form>
[...] HTML5 Post:
    POST /example.org?a=w&b=x
    Host: example.org
    ...
    c=y&d=z
HTML4 Post:
    POST /form_processor
    Host: example.org
    ...
    a=w&b=x&c=y&d=z
- James
Presumably people will use more than one templated form on their site, but won't want lots of separate form_processors, so they would have to use
    <form action="http://example.org/form_processor?{-join|&|a,b}" template="http://example.org?{-join|&|a,b}" method="POST">
or something theoretically more correct like
    <form action="http://example.org/form_processor?%7B-join%7C&amp;%7Ca,b%7D" template="http://example.org?{-join|&amp;|a,b}" method="POST">
and then they can drop in a standard generic form_processor script to handle everything automatically.
Most legacy browser users could be handled by a script which adds onsubmit hooks to rewrite the 'action' attribute before submitting. (I assume that'd work correctly in current browsers, but haven't tested it.) That would avoid the need for repeating the template URI twice (with the associated risks of typing one of them wrong and not noticing), if you don't want to handle scriptless users. (How would the script know when it should do the rewriting, and when it should leave everything to the browser? There's no obvious feature test it can perform.)
Wondering about why this feature would be used: If everyone who uses template URIs uses these backward-compatibility additions (which they have to, unless they have no users), why would a browser implement native support for template URIs?
(The reason I can think of is that it provides a slightly better user experience, because you can go directly to the destination rather than being delayed by a round-trip to form_processor, but that's no faster than the scripted approach.) If everyone who uses template URIs has to use these backward-compatibility additions, why would they go to that effort instead of using some server-side redirection logic to perform the desired processing at the normal non-templated ugly URI? (Maybe it makes the system cleaner if the server code has a nice URI-based API and the client code does the mapping onto that, but I have no idea how much difference it really makes. More significantly, it allows the direct use of external resources that have sufficiently nice URIs but don't have an equivalent GET/POST form-accessible API. I haven't seen any other obvious useful uses yet.) -- Philip Taylor [EMAIL PROTECTED]
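A rough sketch of the rewriting step such an onsubmit hook might perform, for what it's worth. The function name expandTemplate is made up for illustration, and it only understands the two forms that appear in this thread, {name} and {-join|sep|a,b}; the reading of -join as "name=value pairs joined by sep, skipping undefined or empty variables" is an assumption based on the example above, not a full implementation of the URI Template draft.

```javascript
// Hypothetical helper, not part of any spec or library: expands {name} and
// {-join|sep|a,b} in a template string using the supplied field values.
function expandTemplate(template, values) {
  return template.replace(/\{([^{}]*)\}/g, function (whole, body) {
    var parts = body.split('|');
    if (parts[0] === '-join' && parts.length === 3) {
      var sep = parts[1];
      return parts[2].split(',')
        // Skip variables that are undefined or empty (assumed behaviour).
        .filter(function (name) {
          return values[name] !== undefined && values[name] !== '';
        })
        .map(function (name) {
          return encodeURIComponent(name) + '=' + encodeURIComponent(values[name]);
        })
        .join(sep);
    }
    if (parts.length === 1 && body.charAt(0) !== '-') {
      // Plain {name}: substitute the encoded value, or nothing if undefined.
      return values[body] === undefined ? '' : encodeURIComponent(values[body]);
    }
    return whole; // Any other operator is left untouched in this sketch.
  });
}

console.log(expandTemplate('http://example.org?{-join|&|a,b}',
                           { a: 'w', b: 'x', c: 'y', d: 'z' }));
```

A hook would presumably assign the result to the form's action before submission; the fields not named in the template (c and d here) would still travel in the request body as usual.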
Re: [whatwg] HTML5 and URI Templates
On 16/12/2007, Henri Sivonen [EMAIL PROTECTED] wrote: On Dec 16, 2007, at 05:28, James M Snell wrote:
    <form template="http://example.org{-prefix|/|foo}?bar={bar}" method="POST">
      Foo: <input name="foo" type="input">
      Bar: <input name="bar" type="input">
    </form>
What's the backward-compatibility story of this feature? (Both behavior of URI templates in legacy browsers and ensuring that existing content doesn't use braces.)
Out of ~15K random pages from dmoz.org, I see two with braces in <form action>:
http://www.bornsvilkar.dk/ - <form name="mainform" method="post" action="BV.Main.BV.Browse.aspx?path=%2fwww_bornsvilkar_dk%2fbornsvilkar&amp;layout={0685D858-53CA-4F7E-A3C8-53D1BD7F277D}" id="mainform">
http://bip.wokiss.pl/margoninm/ - <FORM ACTION='index.php?pid=2&opcje=a:1:{i:0;s:6:"wyszuk";}' METHOD='POST' ENCTYPE='multipart/form-data' name="wyszukiwarka" style="margin:0px;">
But the original example had <form template> which would avoid that conflict. (The only template attributes I see are one page with <widget template> and one with <edittag:edit template>.) -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Compatibility problems with HTML5 Canvas spec.
On 25/09/2007, Oliver Hunt [EMAIL PROTECTED] wrote: Firefox 2/3 and Safari 2 clear the context's path on strokeRect/fillRect, this violates the spec -- but there are many websites that now rely on such behaviour despite the behaviour defined in HTML5. This means that those browsers that match the current draft (e.g. Safari 3 and Opera 9.x) fail to render these websites correctly.
How hard would it be to get those sites fixed? If there are problems in something like PlotKit or Reflection.js, which lots of people copy onto their own servers, then it would be a pain to break compatibility. If it's just sites like canvaspaint.org where there is a single copy of the code and the developer still exists and can update it, it seems a much less significant problem to break compatibility.
Unfortunately it isn't really an edge case as it's a relatively common occurrence -- people expect that the rect drawing function (for example) will clear the path, so expect clearRect (myCanvasElement.width, myCanvasElement.height) to clear the rect and reset the path, and other similarly exciting things :-/
Firefox also resets the path on drawImage and putImageData, unlike Opera and Safari 3 - do people depend on that behaviour too? --Oliver -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Issues concerning the base element and xml:base
On 07/08/07, Ian Hickson [EMAIL PROTECTED] wrote: This is how it stood back in May (using a sample of several hundred thousand pages taken mostly from the more popular sites); number of unique URIs in base href attributes as a percentage of all pages parsed:
    0: 93.7%
    1: 6.31%
    2: 0.0308%
    3: 0.00105%
    4: 0.00197%
This is how it stands as of today (using the same sampling method):
    0: 94.1%
    1: 5.93%
    2: 0.0215%
    3: 0.000928%
    4: 0.000288%
(All numbers rounded to three significant figures.)
That rounding seems quite misleading - if I haven't forgotten how to do statistics, and if the details I am forgetting are not critical ones, and if I'm not misinterpreting how you collected the data, then the samples are independent and from a binomial distribution that can be approximated as a normal distribution with standard deviation sqrt(n*p*(1-p)), and if assuming n=100,000 and guessing p from the data then the 95%-confidence (+/- 2 s.d.) ranges are something like:
    0: (93.7 +/- 0.15)%
    1: (6.3 +/- 0.15)%
    2: (0.03 +/- 0.01)%
    3: (0.001 +/- 0.002)%
    4: (0.002 +/- 0.003)%
and
    0: (94.1 +/- 0.15)%
    1: (5.9 +/- 0.15)%
    2: (0.02 +/- 0.01)%
    3: (0.001 +/- 0.002)%
    4: (0.0003 +/- 0.001)%
(though the normal approximation breaks down in the <= 0.002% bits), so you can't determine anything about changes in frequency beyond the zero/one cases. -- Philip Taylor [EMAIL PROTECTED]
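For anyone who wants to redo the arithmetic, here is a small sketch of the +/- 2 s.d. calculation. The function name twoSigma is made up, and n = 100,000 is the same assumed sample size as in the message above. Note that sqrt(n*p*(1-p)) is the standard deviation of the count, so for a proportion it is divided by n, giving sqrt(p*(1-p)/n).

```javascript
// Half-width of the approximate 95% (+/- 2 s.d.) range for an observed
// proportion p out of n independent samples (normal approximation).
function twoSigma(p, n) {
  return 2 * Math.sqrt(p * (1 - p) / n);
}

var n = 100000; // assumed sample size, not the real one
[0.937, 0.0631, 0.000308, 0.0000105, 0.0000197].forEach(function (p) {
  // Print each proportion and its half-width as percentages.
  console.log((100 * p) + '% +/- ' + (100 * twoSigma(p, n)).toPrecision(2) + '%');
});
```

With a larger real sample the ranges shrink by a factor of sqrt(n), so the conclusion only holds to the extent that the guess at n is in the right ballpark.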
[Whatwg] IE-only character entity references
IE undocumentedly recognises some which nobody else does:
    aafs  U+206D  ACTIVATE ARABIC FORM SHAPING
    ass   U+206B  ACTIVATE SYMMETRIC SWAPPING
    iafs  U+206C  INHIBIT ARABIC FORM SHAPING
    iss   U+206A  INHIBIT SYMMETRIC SWAPPING
    lre   U+202A  LEFT-TO-RIGHT EMBEDDING
    lro   U+202D  LEFT-TO-RIGHT OVERRIDE
    nads  U+206E  NATIONAL DIGIT SHAPES
    nods  U+206F  NOMINAL DIGIT SHAPES
    pdf   U+202C  POP DIRECTIONAL FORMATTING
    rle   U+202B  RIGHT-TO-LEFT EMBEDDING
    rlo   U+202E  RIGHT-TO-LEFT OVERRIDE
    zwsp  U+200B  ZERO WIDTH SPACE
(I believe that list is complete.)
The first eleven were suggested on https://listserv.heanet.ie/cgi-bin/wa?A2=ind9605L=html-wgP=4579 some time ago but don't seem to have gone very far (except into IE). I can see some legitimate users at http://www.tasb.com/services/field/staff/index.aspx?print=true and http://www.pelesoft.co.il/ and maybe there's a few dozen or hundred more elsewhere (but I can't measure it easily). There's some in text-art at http://yy28.60.kg/test/read.cgi/maido3/1096370177/l50 and quite a lot in weird places like http://cheese.2ch.net/life/kako/1010/10103/1010391447.html or http://zerosen52.gozaru.jp/log/1093422333.html that I don't understand but that seem to all be on 2channel (or copied from it). I've no idea how common they are in general.
Are these used significantly on the web, or would they be considered highly useful if anyone knew they existed, or should HTML5 just ignore them? -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] canvas Firefox support for toDataURL broken
On 10/07/07, dev [EMAIL PROTECTED] wrote: Hey, I am not sure if this is the correct place to post this, so forgive me if I am wrong (and point it out too). The spec states the toDataURL(image/svg+xml) should return image in svg format , if it can't support that then png image should be returned. But it seems firefox throws an exception to canvas.toDataURL(image/svg+xml) whereas it should be returning the image in png format. That's correct, and it's just a bug in Firefox. Throwing an exception on toDataURL(image/png, null) is a vaguely similar bug. (Opera agrees with the spec in both cases. Safari doesn't implement toDataURL at all). Probably Firefox should change its behaviour, unless it has good reasons not to, in which case possibly the spec should change to match it. https://bugzilla.mozilla.org/ is the best place for reporting bugs like this, under component 'Core' / 'Layout: Canvas'. (I've got a load of test failures recorded at http://canvex.lazyilluminati.com/tests/tests/results.html, and more from not-quite-finished tests - I've been waiting to have more completeness before reporting all the found bugs, but I keep getting distracted by other things and haven't got around to that yet...) Regards, dev -- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] Canvas arcTo
Which straight line do you mean? In the first case, the constraints are: * There is a circle with the given radius. * The infinite line P0-P1 is tangential to that circle. * The infinite line P1-P2 is tangential to that circle. * The Arc is the shortest arc of that circle, between the points where the circle touches the two lines. When P0-P1-P2 is a straight line, there is a circle (among many others) which satisfies the first three constraints, and there is a zero-length arc of that circle which satisfies the fourth constraint. (You can't then re-calculate the circle's radius from the arc, because the arc is just a single point, but I don't think that means the arc doesn't exist as part of a finite circle). That's not very useful when you want to draw stuff since there are infinitely many distinct things you could draw, but it's not the case that there's nothing you could draw. In the second case, there is one distinct circle (with zero radius) which touches both the lines, and there is one distinct point which the start and end tangent points must be equal to, and the shortest arc which joins those two points has zero length. There's still infinitely many such arcs and it gets a bit confusing if you want to work out its direction (in order to draw line joins and caps), but you'd always be drawing at least a line from P0 to P1. (To handle that confusion about the zero-sized arc, I think my earlier suggestion should be modified to say ... Otherwise, if x1=x2 and y1=y2, or if the line defined by the points (x0, y0) and (x1, y1) is parallel and in the same direction as the line defined by the points (x1, y1) and (x2, y2), ** or if radius is zero, ** then the method must connect the point (x0, y0) to the point (x1, y1) by a straight line and add the point (x1, y1) to the subpath. ...) Actually, I just realised there's still a problem in the normal non-parallel non-zero-size case, because there are four different circles which have the two infinite lines as tangents. 
(And you have to use infinite lines rather than finite lines, to handle the second case in http://canvex.lazyilluminati.com/misc/arcto.html like Safari). So I think it would have to say something like: Otherwise, let L01 be the line through the points (x0, y0) and (x1, y1), and let L12 be the line through the points (x1, y1) and (x2, y2). Consider the circle that has L01 and L12 as tangents, and has its origin and the point (x2, y2) on the same side of L01, and has its origin and the point (x0, y0) on the same side of L12, and has radius radius. The points at which this circle touches these two lines are called the start and end tangent points respectively. Let The Arc be the shortest arc given by the circumference of this circle, joining the start and end tangent points. unless I got anything else wrong. On 03/07/07, Kristof Zelechovski [EMAIL PROTECTED] wrote: The questioned wording is correct: a straight line has infinite radius and thus does not match the requirement if the radius is finite. Chris -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philip Taylor Sent: Monday, July 02, 2007 1:42 PM To: WHATWG Subject: [whatwg] Canvas arcTo If the point (x2, y2) is on the line defined by the points (x0, y0) and (x1, y1) then the method must do nothing, as no arc would satisfy the above constraints. - why would no arc satisfy the constraints? If P0, P1, P2 are collinear and non-coincident, then (I think) any of the (infinitely many) circles which have the given radius and touch tangential to the line P0-P2 will satisfy the constraints (i.e. being tangential to P0-P1 at some point and to P1-P2 at some point). [snip] Negative or zero values for radius must cause the implementation to raise an INDEX_SIZE_ERR exception. - why not allow zero? You just get an arc at P1 with zero length, with the start and end tangent points both at P1, so the effect would be a straight line from P0 to P1, without needing to handle it as a special case. 
Safari works like that. -- Philip Taylor [EMAIL PROTECTED]
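A sketch of the circle selection proposed above, in case it helps to see it concretely. Of the four circles tangent to both infinite lines, it picks the one whose centre lies inside the angle P0-P1-P2, which is the one on the same side of L01 as (x2, y2) and on the same side of L12 as (x0, y0). The function name arcToCircle and the point objects are made up for illustration, and the exact-zero collinearity test is a simplification that real code would replace with a tolerance.

```javascript
// Hypothetical helper: returns the centre and the start/end tangent points of
// the arcTo circle, or null for the degenerate (coincident/collinear) cases.
function arcToCircle(p0, p1, p2, radius) {
  var ux = p0.x - p1.x, uy = p0.y - p1.y; // direction from P1 towards P0
  var vx = p2.x - p1.x, vy = p2.y - p1.y; // direction from P1 towards P2
  var ul = Math.hypot(ux, uy), vl = Math.hypot(vx, vy);
  if (ul === 0 || vl === 0) return null;  // coincident points
  ux /= ul; uy /= ul; vx /= vl; vy /= vl;
  if (ux * vy - uy * vx === 0) return null; // P0, P1, P2 collinear

  var dot = Math.max(-1, Math.min(1, ux * vx + uy * vy));
  var half = Math.acos(dot) / 2;            // half the angle at P1
  var t = radius / Math.tan(half);          // P1 to each tangent point
  var c = radius / Math.sin(half);          // P1 to the centre
  var bx = ux + vx, by = uy + vy;           // bisector of the angle at P1
  var bl = Math.hypot(bx, by);

  return {
    center: { x: p1.x + c * bx / bl, y: p1.y + c * by / bl },
    start:  { x: p1.x + t * ux, y: p1.y + t * uy }, // on the line P0-P1
    end:    { x: p1.x + t * vx, y: p1.y + t * vy }  // on the line P1-P2
  };
}

console.log(arcToCircle({ x: 0, y: 0 }, { x: 10, y: 0 }, { x: 10, y: 10 }, 5));
```

The start tangent point can land beyond P0 when the radius is large relative to the segment, which is the situation where the infinite-line wording matters.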
[whatwg] Canvas arc
For the 'arc' function: What if startAngle = endAngle? What if endAngle > 2π + startAngle? (The endAngle = 2π + startAngle case isn't interesting since floating-point imprecision means it will never occur.)
In practice: (see the left half of http://canvex.lazyilluminati.com/misc/arc.html for an (unhelpfully unlabelled) random collection of examples)
If startAngle = endAngle:
    Firefox (2+3), Safari (3): Nothing is drawn.
    Opera (9.2+9.5): If anticlockwise = true, a full circle is drawn; otherwise, nothing is drawn.
If endAngle > startAngle + 2π:
    Opera is weird and buggy and would require too much effort to analyse.
    Firefox and Safari mostly match: (Assume startAngle = 0 in all the following)
    If endAngle = 2π + ε [where ε is a small positive real number]: A full circle is drawn.
    If endAngle = 3π - ε: If anticlockwise, 0 to -π is drawn; otherwise a full circle is drawn, and the 0 to π part is drawn twice (i.e. drawn on top of itself, which is visible due to antialiasing effects).
    If endAngle = 2nπ - ε for integer n > 1: If anticlockwise, nothing is drawn; otherwise: Firefox: A full circle is drawn twice. Safari: A full circle is drawn n times.
    (Swapping startAngle vs endAngle is equivalent to swapping clockwise vs anticlockwise.)
So, for FF/Safari: When startAngle - endAngle is in the opposite direction to the (anti)clockwise flag, the two angles are treated modulo 2π and the arc is drawn between them in the appropriate direction. When it's the same direction as the (anti)clockwise flag, Safari extends the path all the way from startAngle to endAngle (going round the whole circle multiple times if necessary), and Firefox does the same except it skips all but the first full going-round-the-whole-circle bit (so it goes round 1 <= n < 2 times, if abs(startAngle-endAngle) > 2π).
It seems sensible to adopt either Firefox's or Safari's approach (which differ only in the amount of overdraw). It's probably easier to use Firefox's, so then Safari would just have to mod the angles a little before drawing them, because I can't see any other reason to choose one approach over the other, and I can't see any reason to choose a totally different approach.
Talking about arcs is confusing when the arc is more than a full circle and wraps around itself and isn't really a mathematical arc any more, so I think it's necessary to not define the operation in terms of arcs. The best I can think of is:
    Let da = endAngle - startAngle.
    If anticlockwise is true, and da > 0 or da < -2π, then let d = (da % 2π) - 2π.
    If anticlockwise is false, and da < 0 or da > 2π, then let d = (da % 2π) + 2π.
    If neither of these cases applies, then let d = (da % 2π).
    In this algorithm, the % operator is defined to have the same semantics as the ECMAScript % operator.
    The arc is defined by the points (radius*cos(a), radius*sin(a)) for all a between startAngle and startAngle + d. The points at a = startAngle and at a = startAngle + d are the path's start and end points respectively.
(The relevance of using the ECMAScript % operator is that (-3) % 2 = -1, etc, so it handles negative numbers (and floating-point numbers) in the way that is needed here, and I can't think of a better way to say the same thing that's still as well-defined and not horribly verbose.)
The right half of http://canvex.lazyilluminati.com/misc/arc.html is implemented as above, and gives exactly the same behaviour as FF in all the cases I have tried. -- Philip Taylor [EMAIL PROTECTED]
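The proposed definition of d is short enough to write out directly. This is only a sketch: the function name arcSweep is made up, and the comparisons are read as da > 0 or da < -2π for the anticlockwise case and da < 0 or da > 2π otherwise, which is the reading that reproduces the Firefox behaviour described above.

```javascript
// Hypothetical helper: signed sweep d such that the arc covers the angles
// from startAngle to startAngle + d (negative d means anticlockwise).
function arcSweep(startAngle, endAngle, anticlockwise) {
  var TAU = 2 * Math.PI;
  var da = endAngle - startAngle;
  if (anticlockwise && (da > 0 || da < -TAU)) {
    return (da % TAU) - TAU;
  }
  if (!anticlockwise && (da < 0 || da > TAU)) {
    return (da % TAU) + TAU;
  }
  return da % TAU; // ECMAScript %: the result takes the sign of da
}

console.log(arcSweep(0, 3 * Math.PI, false)); // about 3π: full circle plus a half
console.log(arcSweep(0, 3 * Math.PI, true));  // about -π: half circle, anticlockwise
```

One thing this makes visible: a difference of exactly 2π falls through to the last branch and gives d = 0, which is the case the message sets aside as not occurring in practice.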
[whatwg] Canvas arcTo
As implemented, the operation of arcTo in Firefox (2, 3) and Opera (9.2, 9.5) is utterly unrelated to the spec and arguably crazy. At least Opera has the right spirit and tries drawing arcs between points, though they're the wrong points and they're always semicircles. Safari nearly matches the spec, and it's still sensible when it disagrees with the spec, so that's the only one that's relevant to consider. There are some examples at http://canvex.lazyilluminati.com/misc/arcto.html. If the point (x2, y2) is on the line defined by the points (x0, y0) and (x1, y1) then the method must do nothing, as no arc would satisfy the above constraints. - why would no arc satisfy the constraints? If P0, P1, P2 are collinear and non-coincident, then (I think) any of the (infinitely many) circles which have the given radius and touch tangential to the line P0-P2 will satisfy the constraints (i.e. being tangential to P0-P1 at some point and to P1-P2 at some point). When P0-P1 and P1-P2 are parallel and the same direction, Safari just draws the line P0-P1. When they are parallel but opposing directions, it instead draws a line from P0 to a point infinitely far from P0 in the direction P1-P2. That is sensible in both cases since it's equal to the limit as the two lines tend towards parallelism. If P0=P1 (and either P2=P1 or P2!=P1) then Safari does nothing at all and does not add any points to the subpath (or, equivalently, it does add the point P1 to the subpath, which has no effect since the line P0-P1 has zero length). If P1=P2 and P0!=P1, then it adds the point P1 to the subpath. Both of these seem generally sane - there's no sensible limit as the points tend towards coincidence, so there's no real correct answer, and drawing the straight line P0-P1 seems an adequate thing to do. Negative or zero values for radius must cause the implementation to raise an INDEX_SIZE_ERR exception. - why not allow zero? 
You just get an arc at P1 with zero length, with the start and end tangent points both at P1, so the effect would be a straight line from P0 to P1, without needing to handle it as a special case. Safari works like that. So, I think the following definition would cover all the cases and match Safari: The arcTo(x1, y1, x2, y2, radius) method must do nothing if the context has no subpaths. If the context does have a subpath, then the behaviour depends on the arguments and the last point in the subpath. Let the point (x0, y0) be the last point in the subpath. If x0=x1 and y0=y1, then the method must do nothing. Otherwise, if x1=x2 and y1=y2, or if the line defined by the points (x0, y0) and (x1, y1) is parallel and in the same direction as the line defined by the points (x1, y1) and (x2, y2), then the method must connect the point (x0, y0) to the point (x1, y1) by a straight line and add the point (x1, y1) to the subpath. Otherwise, if the line defined by the points (x0, y0) and (x1, y1) is parallel and in the opposite direction to the line defined by the points (x1, y1) and (x2, y2), then the method must connect the point (x0, y0) to the point obtained by extending an infinite distance from (x0, y0) in the direction of the line defined by (x1, y1) and (x2, y2), and add that new point to the subpath. Otherwise, let The Arc be the shortest arc given by the circumference of the circle that has one point tangent to the line defined by the points (x0, y0) and (x1, y1), another point tangent to the line defined by the points (x1, y1) and (x2, y2), and that has radius radius. The points at which this circle touches these two lines are called the start and end tangent points respectively. The method must connect the point (x0, y0) to the start tangent point by a straight line, then connect the start tangent point to the end tangent point by The Arc, and finally add the start and end tangent points to the subpath. 
Negative values for radius must cause the implementation to raise an INDEX_SIZE_ERR exception. -- Philip Taylor [EMAIL PROTECTED]
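To make the case analysis in that definition easier to check, here is a sketch that only classifies the arguments and does no drawing. The function name classifyArcTo and the returned labels are made up, RangeError stands in for INDEX_SIZE_ERR, and the radius-zero branch is the amendment suggested in the follow-up message in this thread rather than part of the definition above. Parallelism is tested with an exact zero cross product, which real code would replace with a tolerance.

```javascript
// Hypothetical helper: which branch of the proposed arcTo definition applies,
// where p0 is the last point in the subpath.
function classifyArcTo(p0, p1, p2, radius) {
  if (radius < 0) throw new RangeError('INDEX_SIZE_ERR'); // stand-in exception
  if (p0.x === p1.x && p0.y === p1.y) return 'nothing';

  var ax = p1.x - p0.x, ay = p1.y - p0.y; // direction P0 -> P1
  var bx = p2.x - p1.x, by = p2.y - p1.y; // direction P1 -> P2
  var cross = ax * by - ay * bx;
  var dot = ax * bx + ay * by;

  var p1EqualsP2 = bx === 0 && by === 0;
  if (p1EqualsP2 || (cross === 0 && dot > 0) || radius === 0) {
    return 'line';          // straight line from P0 to P1
  }
  if (cross === 0 && dot < 0) {
    return 'infinite-line'; // parallel, opposite directions
  }
  return 'arc';             // line to start tangent point, then The Arc
}

console.log(classifyArcTo({ x: 0, y: 0 }, { x: 10, y: 0 }, { x: 10, y: 10 }, 5));
```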
[whatwg] WF2 - form action=
WF2 says: When the [form element's action] attribute is absent, UAs must act as if the action attribute was the empty string, which is a relative URI reference, and would thus point to the current document (or the specified base URI, if any).
But: http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20html%3E%0D%0A%3Cbase%20href%3D%22http%3A//google.com%22%3E%3Cform%3E%3Cinput%20type%3Dsubmit%3E
In IE7, FF2, FF3, Opera 9.2, it ignores the base URI and always submits to the current page. In Safari 3, it does take account of the base URI.
In all, <form action=""> does the same as <form>.
In all, <form action="."> does take account of the base URI.
Perhaps it would be sensible to follow the majority. -- Philip Taylor [EMAIL PROTECTED]
[whatwg] Canvas - non-standard globalCompositeOperation
In addition to the standard values for globalCompositeOperation (and ignoring 'darker'), Gecko supports: clear: The Porter-Duff 'clear' operator, which always sets the output to rgba(0, 0, 0, 0). over: Synonym for 'source-over'. The code says not part of spec, kept here for compat. (It looks like FF1.5 had a broken 'source-over', and implemented 'over' like a correct 'source-over'. 'source-over' was fixed in FF2.0, and 'over' left unchanged.) (See http://lxr.mozilla.org/mozilla/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#1703.) WebKit supports: clear: Same as above. highlight: Synonym for source-over. (See http://developer.apple.com/documentation/Cocoa/Reference/ApplicationKit/Classes/NSImage_Class/Reference/Reference.html#//apple_ref/doc/c_ref/NSCompositeHighlight - NSCompositeHighlight: Deprecated. Mapped to NSCompositeSourceOver.) (See http://trac.webkit.org/projects/webkit/browser/trunk/WebCore/platform/graphics/GraphicsTypes.cpp#L34.) Opera is very nice and doesn't do anything wrong. The spec clearly defines the behaviour here: any attempts to set such values must be ignored. 'clear' is pretty useless, since it's exactly equivalent to doing globalAlpha = 0; globalCompositeOperation = 'copy' or (depending on the transform matrix) clearRect(0, 0, w, h). The spec already omits the Porter-Duff 'B' operator (which sets the output to be equal to the destination bitmap, i.e. is equivalent to not drawing anything at all), so it does not seem reasonable to argue for adding 'clear' just for completeness. I can't think of any other reasons for it to be added to the spec, other than for interoperability. As far as I can imagine, for each non-standard value, the possible situations are: * No content relies on that value. 
=> Web browsers should remove support for it: it has no purpose, and it may result in authors accidentally using that value and becoming confused when their code doesn't work in other browsers, which will be irritating for everyone, and it will evolve into the next situation:
* Web content relies on that value. => It should be added to the spec, because it's necessary for handling web content.
* Non-web, browser-specific content (extensions, widgets, etc) relies on that value, and web content doesn't. => It should be disabled except when run in the extension/widget/etc context, to avoid the problems as in the first case. That may cause minor confusion to the extension/widget/etc authors about why their code [which is relying on undocumented features] works differently if they run it on the web instead, but that seems insignificant compared to having interoperability problems on the web.
* Nobody cares. => Nothing happens.
Am I missing any issues here? Would any browser developer think one of the first three situations applies, and be willing to make the necessary changes in that case? -- Philip Taylor [EMAIL PROTECTED]
[whatwg] Canvas patterns, and miscellaneous other things
What should happen if you try drawing a 0x0-pixel repeating pattern? (I can't find a way to make a 0x0 image that any browser will load, but the spec says you can make a 0x0 canvas. Firefox and Opera can't make a 0x0 canvas - it acts like it's 300x150 pixels instead. Safari returns null from createPattern when it's 0x0.) On a somewhat related note: What should canvas.width = canvas.height = 0; canvas.toDataURL() do, given that you can never make a valid 0x0 PNG? (Firefox and Opera make the canvas 300x150 pixels instead, so you can't actually get it that small. Safari can make it that small, but doesn't implement toDataURL.) Similarly, what should toDataURL do when the canvas is really large and the browser doesn't want to give you a data URI? (Opera returns 'undefined' if it's = 30001 pixels in any dimension, and crashes if it's 3 in each dimension. Firefox (2 and trunk) crashes or hangs on Linux if it's = 32768 pixels in any dimension, and crashes on Windows if it's = 65536 pixels). More generally, the spec says If the user agent does not support the requested type, it must return the image using the PNG format - what if it does support the requested type, but still doesn't want to give you a data URI, e.g. because it's the wrong size (too large, too small, not a multiple of 4, etc) or because of other environmental factors (e.g. it wants you to do getContext('vendor-2d').enableVectorCapture() before toDataURL('image/svg+xml'))? (Presumably it would be some combination of falling back to PNG (if you asked for something else), returning undefined, and throwing exceptions.) If the empty string or null is specified, repeat must be assumed. - why allow null, but not undefined or missing? (It would seem quite reasonable for createPattern(img) to default to a repeating pattern). (Currently all implementations throw exceptions for undefined/missing, and Opera and Safari throw for null.) 
'complete' for images is underspecified, so it's not possible to test the related createPattern/drawImage requirements. (Is it set before onload is called? Can it be set as soon as the Image() constructor returns? Can it be set at an arbitrary point during execution of the script that called the Image() constructor? Is it reset when you change src? etc. Implementations all seem to disagree in lots of ways.) About radial gradients: If x0 = x1 and y0 = y1 and r0 = r1, then the radial gradient must paint nothing. - that conflicts with the previous must for following the algorithm, so it's not precise about which you must do. It should probably say If ... then the radial gradient must paint nothing. Otherwise, radial gradients must be rendered by following these steps:. code title=dom-attr-completecomplete/code (twice) - looks like it should be dom-img-complete, so it points to #complete. createPattern(image, repetition) - the parameters should be in vars. The images are not be scaled by this process - s/be // interface HTMLCanvasElement : HTMLElement { attribute unsigned long width; attribute unsigned long height; ^ incorrect indentation (should have two more spaces). Somewhere totally unrelated: interface HTMLDetailsElement : HTMLElement { attribute boolean open; ^ incorrect indentation (should have nine more spaces). -- Philip Taylor [EMAIL PROTECTED]
[whatwg] Canvas line styles comments
.]]

* If the value is bevel, then no extra rendering is needed.
* If the value is round, then UAs must add a filled arc connecting the corners of the strokes on the outside of the join, with the arc's diameter equal to the line width and with its origin at the point of the join.
* If the value is miter, then the intersection point P of the two tangents to the edges of the strokes on the outside of the join is calculated. If the distance from P to the join is greater than or equal to the miter limit ratio multiplied by the line width, then no extra rendering is needed. Otherwise, a triangle must be added between P and the corners of the strokes on the outside of the join.

The final stroke shape of a path is the union of the line strokes, line caps and line joins for all of its subpaths.

[[In particular, there's no non-zero winding number rule. Also, subpaths aren't drawn separately - they're just combined into one shape which then gets filled and composited.]]

...much later...

The stroke() method must calculate the final stroke shape of the current path, using the lineWidth, lineJoin, lineCap, and (if appropriate) miterLimit attributes, and then fill this shape using the strokeStyle attribute.

(Hopefully there aren't too many errors in there.)

(Is it worth having diagrams (kind of like http://canvex.lazyilluminati.com/misc/linejoin.png), so normal people can tell what the interesting bits here actually mean? Or is that best left for tutorials and user reference guides?)

There are some other issues I'm currently aware of, possibly requiring more complexity:

What happens when a stroked path has zero length, in terms of drawing the line caps/joins? In particular, square caps are impossible because the line does not have a defined direction (assuming we're not having dashed paths for now). In Firefox 2 and Opera, nothing is drawn for zero-length paths. In Firefox 3 and Safari, round caps/joins are drawn (because the direction of the line doesn't matter in that case, so the output is well-defined), and nothing else is drawn.

What happens when a stroked path contains a line with zero length, between non-zero-length lines? As far as I can tell, zero-length lines never have any effect (e.g. line-joins get drawn between two non-consecutive non-zero-length lines if they have only zero-length lines between them, so the earlier suggestion for defining 'join' is wrong) - except when the path has no non-zero-length lines in it, in which case the presence of a zero-length line causes round caps to be drawn in FF3/Safari. (...except in FF3 when it's a zero-length quadratic/Bézier curve).

Maybe it'd be best just to require that lines with zero length are never added to the subpath - so if you don't add any non-zero-length ones, the subpath will be empty and won't get drawn, which is slightly incompatible with Safari/FF3 but hopefully easy to fix in them, and compatible with Opera/FF2.

-- Philip Taylor [EMAIL PROTECTED]
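The closing suggestion in the message above (never add zero-length lines to a subpath) might look something like this in an implementation's path builder. It is a rough sketch under assumptions: createSubpath, addLine and isDrawable are hypothetical names and not part of the canvas API, and only straight lines are modelled.

```javascript
// Sketch only: createSubpath, addLine and isDrawable are hypothetical
// helpers, not canvas API. Curves are not modelled.

// A subpath starts with a single point (as after moveTo).
function createSubpath(x, y) {
  return { points: [{ x, y }] };
}

// Append a line to (x, y), unless it would have zero length.
// Returns true if the line was added, false if it was dropped.
function addLine(subpath, x, y) {
  const last = subpath.points[subpath.points.length - 1];
  if (last.x === x && last.y === y) {
    return false; // zero-length line: never added to the subpath
  }
  subpath.points.push({ x, y });
  return true;
}

// A subpath with no lines in it contributes nothing to the stroke shape.
function isDrawable(subpath) {
  return subpath.points.length > 1;
}
```

Under this rule a subpath made only of zero-length lines stays empty, so no caps are drawn for it, and joins between non-zero-length lines are unaffected by zero-length lines that were requested in between.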
Re: [whatwg] HTML syntax: comments before doctype and doctype sniffing
On 18/06/07, Ian Hickson [EMAIL PROTECTED] wrote:

On Sun, 3 Dec 2006, Simon Pieters wrote: Also, as an additional constraint in the syntax section, the entire doctype probably should (or must) be within the first 1024 bytes, because AFAIK browsers generally only sniff for the first 1024 bytes, and if they don't find the entire doctype within that then you get quirks mode.

I couldn't reproduce that.

In Firefox 2:

javascript:s='?';for(i=0;i<1006;++i)s+=' ';window.location='data:text/html,'+s+'<!doctype html><script>document.write(document.compatMode)</script>'

javascript:s='?';for(i=0;i<1007;++i)s+=' ';window.location='data:text/html,'+s+'<!doctype html><script>document.write(document.compatMode)</script>'

The first produces CSS1Compat, the second BackCompat. As far as I can tell, Firefox requires the doctype to be found when parsing [using standards-mode rules] the first 1024 characters (not bytes) from the first non-whitespace character, and then it reparses the whole document in quirks mode if necessary.

-- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] HTML syntax: comments before doctype and doctype sniffing
On 18/06/07, Martin Payne [EMAIL PROTECTED] wrote:

Philip Taylor wrote: In Firefox 2:

javascript:s='?';for(i=0;i<1006;++i)s+=' ';window.location='data:text/html,'+s+'<!doctype html><script>document.write(document.compatMode)</script>'

javascript:s='?';for(i=0;i<1007;++i)s+=' ';window.location='data:text/html,'+s+'<!doctype html><script>document.write(document.compatMode)</script>'

The first produces CSS1Compat, the second BackCompat.

Not for me it doesn't (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20070603 Fedora/2.0.0.4-2.fc8 Firefox/2.0.0.4). Both render in standards mode for me.

Hmm, that might have been some unfortunate line wrapping - it's probably better to write:

javascript:s='?';for(i=0;i<1006;++i)s+='%20';window.location='data:text/html,'+s+'<!doctype%20html><script>document.write(document.compatMode)</script>'

javascript:s='?';for(i=0;i<1007;++i)s+='%20';window.location='data:text/html,'+s+'<!doctype%20html><script>document.write(document.compatMode)</script>'

where each should be one line with no spaces. Then I get the CSS1Compat/BackCompat difference when just copying those into the location bar, in Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4 and Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a6pre) Gecko/20070618 Minefield/3.0a6pre. (IE7 and Opera 9 don't appear to have any limit on how early the doctype should appear.)

-- Philip Taylor [EMAIL PROTECTED]
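For anyone wanting to vary the padding count without hand-editing the URLs above, the same test page can be generated by a small helper. This is a sketch, not part of the original test: buildDoctypeTestUrl is a hypothetical name, and it assumes the angle brackets that the archive dropped from the quoted URLs (the loop condition, the doctype and the script tags).

```javascript
// Sketch only: buildDoctypeTestUrl is a hypothetical helper name.
// It builds the same data: URL as the javascript: snippets in the thread:
// a '?' followed by `padding` escaped spaces, then the doctype and a script
// that writes document.compatMode.
function buildDoctypeTestUrl(padding) {
  let s = '?';
  for (let i = 0; i < padding; ++i) {
    s += '%20'; // escaped space, so line wrapping cannot change the count
  }
  return 'data:text/html,' + s +
    '<!doctype%20html><script>document.write(document.compatMode)</script>';
}
```

Calling buildDoctypeTestUrl(1006) and buildDoctypeTestUrl(1007) and loading each result would correspond to the two cases discussed in the thread; what each browser then reports is exactly what the thread is trying to establish, so no expected result is given here.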
[whatwg] Canvas ImageData comments
.

"If getContext() is called with that exact string for tis contextId argument ..." - s/tis/its/

"while one could create an ImageData object, one would net necessarily know what resolution the canvas expected" - s/net/not/

-- Philip Taylor [EMAIL PROTECTED]
[whatwg] Canvas shadow rendering
the images (particularly image A) have infinite size, so a shape drawn entirely off-screen can still cast shadows into the visible area. The algorithm as specified is quite horrendously inefficient, but the obvious optimisations are to skip the entire shadow part if the shadow colour is fully transparent, and to perform the Gaussian blur by doing the horizontal and vertical components separately and ignoring the bits where max(u,v) > shadowBlur (because G will be so small that its contribution to the shadow will be lost in the rounding errors). I assume implementors can work that out for themselves, since it's just a standard Gaussian blur - the only peculiar bit is the mapping from shadowBlur to σ. So that's all alright.

One odd issue is with clearRect - Safari sort of applies shadows to that, as in http://canvex.lazyilluminati.com/misc/shadow/shadow4.html, except it ignores the colour and just uses the blur. It seems sensible to just call that a bug, and require that shadows never apply to clearRect since it doesn't go through the Drawing Model at all.

I'll try to look out for any other possible problem areas.

-- Philip Taylor [EMAIL PROTECTED]
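The two optimisations mentioned in the message above (skipping the shadow when its colour is fully transparent, and doing the Gaussian blur as separate horizontal and vertical passes with a truncated kernel) could be sketched as below. All names are hypothetical, the alpha channel is modelled as a plain array of rows, and both σ and the truncation radius are taken as parameters because the mapping from shadowBlur to σ is not reproduced in this part of the message.

```javascript
// Sketch only: every function name here is hypothetical.
// sigma and radius are supplied by the caller (assumption).

// Normalised 1-D Gaussian kernel, truncated at `radius` samples each side.
function gaussianKernel(sigma, radius) {
  const kernel = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const g = Math.exp(-(i * i) / (2 * sigma * sigma));
    kernel.push(g);
    sum += g;
  }
  return kernel.map((g) => g / sum);
}

// Convolve every row with the kernel. Samples outside the array are
// treated as 0, i.e. fully transparent surroundings.
function blurRows(img, kernel) {
  const radius = (kernel.length - 1) / 2;
  return img.map((row) =>
    row.map((_, x) => {
      let acc = 0;
      for (let k = -radius; k <= radius; k++) {
        const sx = x + k;
        if (sx >= 0 && sx < row.length) acc += row[sx] * kernel[k + radius];
      }
      return acc;
    })
  );
}

// Swap rows and columns so the row pass can be reused for columns.
function transpose(img) {
  return img[0].map((_, x) => img.map((row) => row[x]));
}

// Separable blur: one horizontal pass, then one vertical pass.
function blurAlpha(alpha, sigma, radius) {
  const kernel = gaussianKernel(sigma, radius);
  const horizontal = blurRows(alpha, kernel);
  return transpose(blurRows(transpose(horizontal), kernel));
}

// Returns null when the shadow colour is fully transparent, so the caller
// can skip the shadow pass entirely; otherwise returns the blurred alpha.
function shadowAlpha(alpha, shadowColourAlpha, sigma, radius) {
  if (shadowColourAlpha === 0) return null;
  return blurAlpha(alpha, sigma, radius);
}
```

The separable form does work proportional to the kernel width per pixel in each pass, rather than to its square as a direct 2-D convolution would.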
[whatwg] Numerical imprecision in charset detection
8.2.2. The input stream: "If the next six characters are not 'charset'" - s/six/seven/

-- Philip Taylor [EMAIL PROTECTED]
Re: [whatwg] noscript should be allowed in head
On 30/05/07, Maciej Stachowiak [EMAIL PROTECTED] wrote:

On May 30, 2007, at 2:02 AM, Julian Reschke wrote: So let's rephrase this question: will there be a conformance class for HTML5 consumers that *only* accept conforming documents? (Keep in mind that these consumers may not even have a DOM or a Javascript engine).

Do you mean: (A) only documents that meet all document conformance criteria (B) only documents that meet all *machine-checkable* conformance criteria or (C) documents that would not trigger any parse errors if the parsing algorithm were applied?

Perhaps it would be better to rephrase as: Will there be a conformance class for HTML5 consumers that process conforming documents according to the spec, but process non-conforming documents in an undefined way? (Some non-conforming documents might still be processed according to the spec, instead of being rejected, so it doesn't *only* accept conforming documents. That makes it not be impossible, when using the full definition of conformance.)

At least that's how I interpret the original intent - it means tools in systems with guaranteed document conformance (i.e. not taking input from the general web) could be simplified while still claiming to be conformant and still being interoperable with other such tools. They would only have to be compatible with the rules for processing conforming documents, instead of being compatible with the rules defined by browsers for non-conforming documents. (Is that interpretation correct, or am I totally missing the point?)

(I'm not sure whether it's that useful to be able to claim conformance for its own sake. Interoperability is useful, but maybe that can be achieved by imagining a new spec which just says "If a document is conforming according to the definition in HTML5, then it must be processed as described in HTML5, otherwise the document should be rejected but anything may happen" and all the tools can follow that, so there's no need for HTML5 itself to explicitly allow that.)

(Keep in mind that these consumers may not even have a DOM or a Javascript engine).

http://www.whatwg.org/specs/web-apps/current-work#non-scripted already defines UA conformance when there's no scripting, which seems to cover those cases.

-- Philip Taylor [EMAIL PROTECTED]