Re: [whatwg] Reserving id attribute values?
On Tue, 19 May 2009, Brett Zamir wrote: In order to comply with XML ID requirements in XML, and facilitate future transitions to XML, can HTML 5 explicitly encourage id attribute values to follow this pattern (e.g., disallowing numbers for the starting character)?

Why can't we just change the XML ID requirements in XML to be less strict?

Also, there is this minor erratum: http://www.whatwg.org/specs/web-apps/current-work/#refsCSS21 is broken (in section 3.2)

I haven't done any references yet; I'll probably get to them in a couple of months.

-- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Cross-domain databases; was: File package protocol and manifest support?
On Wed, 20 May 2009, Brett Zamir wrote: I would like to suggest an incremental though I believe significant enhancement to Offline applications/SQLite. That is, the ability to share a complete database among offline applications according to the URL from which it was made available. [...]

On Tue, 19 May 2009, Drew Wilson wrote: One alternate approach to providing this support would be via shared, cross-domain workers (yes, workers are my hammer and now everything looks like a nail :) - this seems like one of the canonical uses of cross-domain workers, in fact. This would be potentially even more secure than a simple shared database, as it would allow the application to programmatically control access from other domains, synchronize updates, etc., while allowing better controls over access (read-only, write via specific exposed write

On Wed, 20 May 2009, Rob Kroeger wrote: For what it's worth, this was my immediate thought as well upon reading the idea. The database is insufficiently fast on some platforms to serve as an IPC mechanism, and there are practical limitations with having too many contending transactions, so my instinct would be to build large integrated web apps with a shared worker routing data between components.

On Thu, 28 May 2009, Michael Nordman wrote: I buy this thinking too as a better strategy for integrating web apps.

Based on the above comments, I haven't added the requested feature at this time -- let's see if the existing features can be used to do it first.

On Thu, 28 May 2009, Michael Nordman wrote: But still, the ability to download a fully formed SQL database, and then run SQL against it, would be nice.

openDatabaseFromURL(urlToDatabaseFile);
* downloads the database file if needed (per HTTP cache-control headers)
* the database can reside in an appcache (in which case it would be subject to appcache'ing rules instead)
* returns a read-only database object

Of course, there is the issue of the SQL database format.
On Thu, 28 May 2009, Anne van Kesteren wrote: Would there be a lot of overhead in just doing this through XMLHttpRequest, some processing, and the database API?

On Thu, 28 May 2009, Michael Nordman wrote: Good question. I think you're suggesting...
* statementsToCreateAndPopulateSQLDatabase = httpGet();
* foreach (statement in above) { execute(statement); }
* now you get to run queries of interest

Certainly going to use more client-side CPU than downloading a fully formed db file. I think the download size would be greater (all of the 'INSERT INTO' text overhead), but that's just a guess. A database containing FTS tables would change things a bit too (even less client-side CPU, but more download size).

On Fri, 29 May 2009, Anne van Kesteren wrote: There are certainly drawbacks, but given that we still haven't nailed down all the details of the database API proposal discussed by the WebApps WG (e.g. the SQL syntax), and given that it has not been deployed widely, it seems somewhat premature to start introducing convenience APIs around it that introduce a significant amount of complexity themselves. Defining the rules for parsing and creating a raw database file in a secure way is a whole new layer of issues, and the gain seems small.

On Fri, 29 May 2009, Michael Nordman wrote: I don't think this feature's time has come yet either. Just food for thought.

I guess we'll wait on this for now.

-- Ian Hickson http://ln.hixie.ch/
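For illustration, the XMLHttpRequest-plus-statements approach Anne and Michael discuss above might look like the sketch below. The names (DUMP_URL, the database parameters) are invented, and the statement splitter is deliberately naive: it would mishandle semicolons inside string literals.

```javascript
// Split a downloaded SQL dump into individual statements.
// Naive: assumes no semicolons appear inside string literals.
function splitStatements(dump) {
  return dump
    .split(';')
    .map(function (s) { return s.trim(); })
    .filter(function (s) { return s.length > 0; });
}

// Browser-only wiring (requires a Web SQL capable UA; not runnable here):
// var xhr = new XMLHttpRequest();
// xhr.open('GET', DUMP_URL, false);
// xhr.send(null);
// var db = openDatabase('mirror', '1.0', 'mirrored db', 5 * 1024 * 1024);
// db.transaction(function (tx) {
//   splitStatements(xhr.responseText).forEach(function (stmt) {
//     tx.executeSql(stmt);
//   });
// });
```

This replays 'CREATE TABLE' and 'INSERT INTO' statements client-side, which is exactly the extra CPU and download overhead Michael estimates above.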
Re: [whatwg] HTML5 ruby spec: rp
On Tue, 19 May 2009, Roland Steiner wrote: As I am currently in the process of writing an implementation for ruby, I was wondering about the constraints put on the content of the rp element in the spec (http://dev.w3.org/html5/spec/Overview.html#the-rp-element):

"If the rp element is immediately after an rt element that is immediately preceded by another rp element: a single character from Unicode character class Pe. Otherwise: a single character from Unicode character class Ps."

Is there a specific reason that rp is constrained in this way? I imagine that someone could want to add additional spaces before/after the parenthesis, non-parenthesis separators, or, e.g., in a textbook write:

<ruby><rp>(reading: </rp><rt>Kanji</rt><rp>) </rp></ruby>

Also note that there isn't such a constraint if one would use CSS rules to achieve a similar result (in the absence of proper ruby rendering):

rt:before { content: "(reading: "; }
rt:after { content: ") "; }

Yeah, I guess this constraint is excessive. I've removed it.

-- Ian Hickson http://ln.hixie.ch/
[whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop
SUMMARY

Currently the input element exposes selected files via a files accessor (i.e., as in https://developer.mozilla.org/En/NsIDOMFileList and http://www.w3.org/TR/file-upload/). We should add a similar accessor to event.dataTransfer to enable drag-and-drop of files onto web pages.

USE CASE

When interacting with a webmail site, users would like to be able to attach files to email messages by dragging them onto the browser's content area, as in desktop email clients. My understanding is that this is one of the top user requests for Gmail, for example.

WORKAROUNDS

Currently, webmail sites work around this limitation by using the fugly input type=file control or by using a plug-in, such as Flash, to handle file uploads. Other sites, such as Flickr, work around this limitation by asking users to download an EXE (i.e., the Flickr uploader) that handles file drag-and-drop.

PROPOSAL

When the user drags and drops files onto a web page, we should expose those files to the page via a files accessor on the dataTransfer property of the event object. This feature is consistent with HTML 5's security model for drag-and-drop.

There are a number of different API choices, but this appears to be the cleanest and the most consistent with the future of web pages interacting with local files. Alternative APIs include getData('File'), as defined in http://msdn.microsoft.com/en-us/library/ms537658(VS.85).aspx. However, it does not appear that IE ever implemented this API. (Also, note that IE doesn't follow HTML 5's clipboard security model.) Mozilla has an equivalent API in event.dataTransfer.mozGetDataAt('application/x-moz-file', 0).

Exposing the files attribute is better than these alternatives because it lets the web page get an object of type File, which can then be handed off to a future version of XMLHttpRequest, as in xhr.send(file), without synchronous access to the disk.
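A hedged sketch of how a page might consume the proposed accessor. The files property on dataTransfer is exactly the proposal under discussion; uploadUrl and the element wiring are invented for illustration.

```javascript
// Pure helper: turn a FileList-like object (from the proposed
// event.dataTransfer.files accessor) into an array of file names,
// e.g. so the page can list what was dropped.
function droppedFileNames(dataTransfer) {
  var names = [];
  for (var i = 0; i < dataTransfer.files.length; i++) {
    names.push(dataTransfer.files[i].name);
  }
  return names;
}

// Browser-only wiring (hypothetical; uploadUrl is made up):
// element.addEventListener('drop', function (event) {
//   event.preventDefault();
//   var files = event.dataTransfer.files;   // the proposed accessor
//   for (var i = 0; i < files.length; i++) {
//     var xhr = new XMLHttpRequest();
//     xhr.open('POST', uploadUrl);
//     xhr.send(files[i]);  // future xhr.send(file), per the proposal
//   }
// });
```

Note how each dropped File object can be handed straight to XMLHttpRequest without the page ever touching the disk, which is the key advantage claimed over getData-style string APIs.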
IMPLEMENTATION

WebKit has an experimental implementation of this API in https://bugs.webkit.org/show_bug.cgi?id=25916.

Adam
Re: [whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop
On Wed, 10 Jun 2009 10:37:03 +0200, Adam Barth wha...@adambarth.com wrote: SUMMARY Currently the input element exposes selected files via a files accessor (i.e., as in https://developer.mozilla.org/En/NsIDOMFileList and http://www.w3.org/TR/file-upload/). We should add a similar accessor to event.dataTransfer to enable drag-and-drop of files onto web pages.

This is indeed very cool, but http://www.w3.org/TR/file-upload/ is very unstable (and from 2006!), so it seems that would have to settle a bit more first. At the very minimum we need a shared understanding of what interfaces we want to provide to deal with files.

-- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Helping people searching for content filtered by license
On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote: [...] This has some implications: - Each unit of content (recipe in this case) must have its own independent page at a distinct URL. This is actually good practice anyway today for making content discoverable from search engines, and it is compatible with what people already do, so this seems fine.

This is, in a wide range of cases, entirely impossible: while it might work, and maybe it's even good practice, for content that can be represented on the web as an HTML document, it is not achievable for many other formats. Here are some obvious cases:

Pictures (and other media) used on a page: An author might want to have protected content, but to allow re-use of some media under certain licenses. A good example of this are online media libraries, which have a good deal of media available for reuse but obviously protect the resources that inherently belong to the site (such as the site's own logo and design elements). Having a separate page to describe each resource's licensing is not easily achievable, and may be completely out of reach for small sites that handle all their content by hand (most prominently, designers' portfolio sites that offer many of their contents under some attribution license to promote their work).

Software: I have stated this previously, but here it goes again: just like with media, it's impossible to simply put a <link rel="license" ...> on an MSI package or a tarball. Sure, the package itself will normally include a file with the text of the corresponding license(s), but this doesn't help in making the licensing discoverable by search engines and other forms of web crawlers. It looks like I should make a page for each of the products (or even each of the releases), so I can put the link tag there and everybody's happy...
Actually, this makes so much sense that I already have such pages for each of my releases (even if there aren't many as of now); but I *can't* put the link on them, because my software is under more liberal licenses (mostly GPL) than other elements of the page (such as the site's logo, appearing everywhere on the site, which is CC-BY-NC-ND), and I obviously don't want such content to appear in searches for images that I can modify and use commercially, for example.

Until now, the best way to approach this need I have seen would be RDF's triple concept: instead of saying "licensed under Y", I'm trying to say "X is licensed under Y", and maybe also "and X2 is licensed under Y2", and this is inherently a triple. I am, however, open to alternatives (at least on this aspect), as long as they provide any benefit other than mere validation (which I don't even care about anymore, btw) over currently deployed and available solutions.

I am not sure whether Microdata can handle this case or not (after all, it is capable of expressing some RDF triples), but the fact is that I can make my content discoverable by Google and Yahoo using ccREL (quite suboptimal, and wouldn't validate as HTML5, but would still work), but I can't do so using Microdata (which is also suboptimal, would validate as HTML5, but doesn't work anywhere yet).

Regards, Eduard Pascual
Re: [whatwg] on bibtex-in-html5
On Wed, 20 May 2009, Bruce D'Arcus wrote: Re: the recent microdata work and the subsequent effort to include BibTeX in the spec, I summarized my argument against this on my blog: http://community.muohio.edu/blogs/darcusb/archives/2009/05/20/on-the-inclusion-of-bibtex-in-html5

| 1. BibTeX is designed for the sciences, which typically only cite
| secondary academic literature. It is thus inadequate for, and not widely
| used in, many fields outside of the sciences: the humanities and law
| being quite obvious examples. For this reason, BibTeX cannot by
| default adequately represent even the use cases Ian has identified.
| For example, there are many citations on Wikipedia that can only be
| represented using effectively useless types such as misc and which
| require new properties to be invented.

We will probably have to increase the coverage in due course, yes. However, we should verify that the mechanism works in principle before investing the time to extend the vocabulary.

| 2. Related, BibTeX cannot represent much of the data in widely used
| bibliographic applications such as Endnote, RefWorks and Zotero except
| in very general ways.

If such data is important, we can always add support when this becomes clear.

| 3. The BibTeX extensibility model puts a rather large burden on inventing
| new properties to accommodate data not in the core model. For example,
| the core model has no way to represent a DOI identifier (this is no
| surprise, as BibTeX was created before DOIs existed). As a
| consequence, people have gradually added this to their BibTeX records
| and styles in a more ad hoc way. This ad hoc approach to extensibility
| has one of two consequences: either the vocabulary terms are
| understood as completely uncontrolled strings, or one needs to
| standardize them. If we assume the first case, we introduce potential
| interoperability problems.
| If we assume the second, we have an
| organizational and process problem: that the WHATWG and/or the
| W3C (neither of which has expertise in this domain) become the
| gate-keepers for such extensions. In either case, we have a rather
| brittle and anachronistic approach to extension.

I don't see any of this as a problem.

| 4. The BibTeX model conflicts with Dublin Core and with vCard, both of
| which are quite sensibly used elsewhere in the microdata spec to
| encode information related to the document proper. There seems little
| justification in having two different ways to represent a document
| depending on whether it is THIS document or THAT document.

I don't understand this point. Could you provide an example of this conflict?

| 5. Aspects of BibTeX's core model are ambiguous/confusing. For example,
| what number does "number" refer to? Is it a document number, or an
| issue number? What's the difference? Why does it matter?

| My suggestion instead?

| 1. Reuse Dublin Core and vCard for the generic data: titles,
| creators/contributors, publisher, dates, part/version relations, etc.,
| and only add those properties (volume, issue, pages, editors, etc.)
| that they omit.

This seems unduly heavyweight (especially the use of vCard for author names) when all that is needed is brief bibliographic entries.

| 2. Typing should NOT be handled by a bibtex-type property, but the same way
| everything else is typed in the microdata proposal: a global
| identifier.

Why?

| 3. Make it possible for people to interweave other, richer, vocabularies
| such as bibo within such item descriptions. In other words, extension
| properties should be URIs.

This is already possible.

| 4. Define the mapping to RDF of such an item description; can we say,
| for example, that it constitutes a dct:references link from the
| document to the described source?

The mapping to RDF is already defined; further mappings can be done using the sameAs mechanism.
On Thu, 21 May 2009, Henri Sivonen wrote: The set of fields is more of an issue, but it can be fixed by inventing more fields -- it doesn't mean the whole base solution needs to be discarded. Fortunately, having custom fields in .bib doesn't break existing pre-Web, pre-ISBN bibliography styles. I've used at least these custom fields:

key: Show this citation pseudo-id in rendering instead of the actual id used for matching.
url: The absolute URL of a resource that is on the Web.
refdate: The date when the author made the reference to an ephemeral source such as a Web page.
isbn: The ISBN of a publication.
stdnumber: RFC or ISO number, e.g. RFC 2397 or ISO/IEC 10646:2003(E).

Particularly the 'url' and 'isbn' field names should be obvious and uncontroversial additions.

url seems widely supported, and I included it. I haven't added any other fields yet; I imagine that once this feature gets traction, we'll have more direct data as to which fields would be most useful, and
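For illustration, Henri's custom fields might appear in a .bib entry like this. Only the field names come from the list above; the entry itself (key, author, dates) is an invented example, not from the thread.

```bibtex
@misc{rfc2397,
  author    = {L. Masinter},
  title     = {The "data" URL scheme},
  year      = {1998},
  url       = {https://www.ietf.org/rfc/rfc2397.txt},
  stdnumber = {RFC 2397},
  refdate   = {2009-06-10}
}
```

Unknown fields are silently ignored by classic BibTeX styles, which is why such ad hoc extension doesn't break pre-existing .bst files.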
Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)
First of all, Ian, thanks for your reply. I appreciate any opinions on this subject.

On Wed, Jun 10, 2009 at 1:29 AM, Ian Hickson i...@hixie.ch wrote: This proposal is very similar to RDF EASE.

Indeed, they are both CSS-based, and they fulfill similar purposes. Let me, however, highlight some differences:

1st, EASE is tightly bound to RDFa. However, RDFa is meant for embedding metadata, and was built with that purpose in mind; while EASE is meant for linked metadata, so building it on top of RDFa's embedding constructs is quite unnatural. In contrast, CRDF is built from CSS's syntax and RDF's (not RDFa's) concepts: it only shares with RDFa what they both inherit from RDF: the concepts and data model.

2nd, EASE is meant to be complementary to RDFa: they address (or attempt to address) different use cases / needs (embedding vs. linking). On the other hand (more on this below), CRDF attempts to address both cases, plus the case where a hybrid approach is appropriate (inlining some metadata, and linking other metadata).

While I sympathise with the goal of making semantic extraction easier, I feel this approach has several fundamental problems which make it inappropriate for the specific use cases that were brought up and which resulted in the microdata proposal: * It separates (by design) the semantics from the data with those semantics.

That's not accurate. CRDF *allows* separating the semantics, but doesn't require doing so. Everything could be inlined, and the possibility of separation is just for when it is needed.

I think this is a level of indirection too far -- when something is a heading, it should _be_ a heading; it shouldn't be labeled opaquely, with a transformation sheet elsewhere defining that it maps to the heading semantic.

That doesn't make much sense. When something is a heading, it *is* a heading. What do you mean by "should be a heading"?
CRDF (like many other syntaxes for RDF) allows parsers that don't know the specific semantics of the markup language to find out that something is actually a heading anyway; and it allows expressing semantics that the markup language has no direct support for (for example, is it a site-section heading? a news heading? an iguana's name (used as the main title for each iguana's page in the iguana collection example)? something else?).

* It is even more brittle in the face of copy-and-paste and regular maintenance than, say, namespace prefixes. It is very easy to forget to copy the semantic transformation rules. It is very easy to edit the document such that the selectors no longer match what they used to match. It's not at all obvious from looking at the page that there are semantics there.

I think the whole copy-paste thing should be broken into two separate scenarios:

Copy-pasting source code: with the next version of the document (which I'm already cleaning up, and which will allow @namespace rules inside the inlining attribute), this will be as brittle (and as resilient) as prefixes are: when a fragment that includes the @namespaces or prefixes it needs is copy-pasted, it will work as expected; OTOH, if a rule relies on a namespace that is not available (declared outside of the copy-pasted fragment), the rule will just be ignored. The risk of the copied code clashing with declarations in its new location is lower than it may seem: an author who is already adding CRDF code to his pages is quite likely to review the code he's copying for the semantics that may be there; and authoring tools that automatically add semantic code should review whether things make sense or not when pasting code into them (for example, invalid/redundant properties could/should be flagged to the author).
Copy-pasting content: currently, browser support for copy-pasting CSS-styled content is mediocre and inconsistent (some browsers do it right, some don't, some don't even try), but this is already more than what is supported for RDFa, Microdata, or other semantic formats. With a bit of luck, pressure for browsers to include CRDF properties when copying content could help to get decent support for CSS properties as well (since most of the code for these tasks would be shared).

* It relies on selectors to do something subtle. Authors have a great deal of trouble understanding selectors -- if you watch a typical Web author writing CSS, he will either use just class selectors, or he will write selectors by trial and error until he gets the style he wants. This isn't fatal for CSS because you can see the results right there; for something as subtle as semantic data mining, it is extremely likely that authors will make mistakes that turn their data into garbage, which would make the feature impractical for large-scale use.

It relies on selectors to do what they do: select things. Nobody is *asking* authors to make use of over-complicated selectors for each piece of metadata they want to add; but CRDF tries to *allow* using any valid
Re: [whatwg] on bibtex-in-html5
Ian Hickson wrote: ... So far based on my experience with the Workers, Storage, Web Sockets, and Server-sent Events sections, I'm not convinced that the advantage of getting more review is real. Those sections in particular got more review while in the HTML5 spec proper than they have since. ... So you are putting stuff you're personally interested in into the HTML5 spec, so that people read it? What a cunning plan. BR, Julian
Re: [whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop
On Wed, Jun 10, 2009 at 10:37 AM, Adam Barth wrote: SUMMARY Currently the input element exposes selected files via a files accessor (i.e., as in https://developer.mozilla.org/En/NsIDOMFileList and http://www.w3.org/TR/file-upload/). We should add a similar accessor to event.dataTransfer to enable drag-and-drop of files onto web pages. [...] Alternative APIs include getData('File'), as defined in http://msdn.microsoft.com/en-us/library/ms537658(VS.85).aspx. However, it does not appear that IE ever implemented this API. (Also, note that IE doesn't follow HTML 5's clipboard security model.) Mozilla has an equivalent API in event.dataTransfer.mozGetDataAt('application/x-moz-file', 0).

It should be noted also that Adobe AIR has getData('application/x-vnd.adobe.air.file-list') [1] and Gears (starting with 0.5.21.0, as announced at Google I/O) has its own (not yet documented) API with a files property [2] (as requested here).

[1] http://help.adobe.com/en_US/AIR/1.5/devappshtml/WS7709855E-7162-45d1-8224-3D4DADC1B2D7.html
[2] http://code.google.com/p/gears/source/browse/trunk/gears/test/manual/drag_and_drop.html#109

-- Thomas Broyer
Re: [whatwg] on bibtex-in-html5
Am cc'ing the Zotero dev list just for posterity...

On Wed, Jun 10, 2009 at 5:44 AM, Ian Hickson i...@hixie.ch wrote: On Wed, 20 May 2009, Bruce D'Arcus wrote: Re: the recent microdata work and the subsequent effort to include BibTeX in the spec, I summarized my argument against this on my blog: http://community.muohio.edu/blogs/darcusb/archives/2009/05/20/on-the-inclusion-of-bibtex-in-html5

| 1. BibTeX is designed for the sciences, which typically only cite
| secondary academic literature. It is thus inadequate for, and not widely
| used in, many fields outside of the sciences: the humanities and law
| being quite obvious examples. For this reason, BibTeX cannot by
| default adequately represent even the use cases Ian has identified.
| For example, there are many citations on Wikipedia that can only be
| represented using effectively useless types such as misc and which
| require new properties to be invented.

We will probably have to increase the coverage in due course, yes. However, we should verify that the mechanism works in principle before investing the time to extend the vocabulary.

No; you should drop this proposal and move it to an experimental annex. If you do insist, against all reason, on pushing forward with this without modification, then I suggest you explain how this process of extension will work. If, as I suspect, it'll be another case of a centralized authority (you, who have admitted you really know nothing about this space), then that's a deal-breaker from my perspective.

| 2. Related, BibTeX cannot represent much of the data in widely used
| bibliographic applications such as Endnote, RefWorks and Zotero except
| in very general ways.

If such data is important, we can always add support when this becomes clear.

Man, this is frustrating.

| 3. The BibTeX extensibility model puts a rather large burden on inventing
| new properties to accommodate data not in the core model.
| For example,
| the core model has no way to represent a DOI identifier (this is no
| surprise, as BibTeX was created before DOIs existed). As a
| consequence, people have gradually added this to their BibTeX records
| and styles in a more ad hoc way. This ad hoc approach to extensibility
| has one of two consequences: either the vocabulary terms are
| understood as completely uncontrolled strings, or one needs to
| standardize them. If we assume the first case, we introduce potential
| interoperability problems. If we assume the second, we have an
| organizational and process problem: that the WHATWG and/or the
| W3C (neither of which has expertise in this domain) become the
| gate-keepers for such extensions. In either case, we have a rather
| brittle and anachronistic approach to extension.

I don't see any of this as a problem.

The problem, to repeat myself again, is related to the above "we'll extend it as we see fit" issue. The two biggest problems in bibtex are two properties:

book
journal

They're a problem because they're both horribly concrete/narrow, and (arguably) redundant. If those were instead replaced with something more generic, like either:

1) publication-title

... or, better yet ...

2) a nested/related object (call it "publication" or "container" or "isPartOf")

... then extension becomes easier. If I need to encode a newspaper article, then I just do:

title = Some Article
publication-title = Some Newspaper

... or (better, because I can attach other information to the container):

title = Some Article
publication = [ title = Some Newspaper ]

As is, you need to add stuff like this just to resolve the problems I've repeatedly pointed out:

newspaper-title
magazine-title
court-reporter-title
television-program-title
radio-program-title

Aside: of course, some of the above could be collapsed into more generic stuff like broadcast-title, but I'm just following the same, broken, approach as bibtex. This stuff isn't theoretical, Ian.
Just look through this Wikipedia page, for example: http://en.wikipedia.org/wiki/Guantanamo_Bay_detention_camp

The citations include references to legal cases and briefs, and news articles (television, radio and print). Your proposal doesn't cover this stuff. OTOH, applications like Zotero can.

| 4. The BibTeX model conflicts with Dublin Core and with vCard, both of
| which are quite sensibly used elsewhere in the microdata spec to
| encode information related to the document proper. There seems little
| justification in having two different ways to represent a document
| depending on whether it is THIS document or THAT document.

I don't understand this point. Could you provide an example of this conflict?

Here's an academic article in an open access biology journal: http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.182

THIS article refers to the metadata about the document proper, with the title Accelerated Adaptive Evolution
Re: [whatwg] Reserving id attribute values?
2009/6/10 Ian Hickson i...@hixie.ch: On Tue, 19 May 2009, Brett Zamir wrote: In order to comply with XML ID requirements in XML, and facilitate future transitions to XML, can HTML 5 explicitly encourage id attribute values to follow this pattern (e.g., disallowing numbers for the starting character)? Why can't we just change the XML ID requirements in XML to be less strict?

Because you are not part of the XML Core WG, because XML is a Recommendation, and because ID has been a Name from the very beginning of SGML. If something should be changed, it is the HTML5 draft. Naturally it should be only an author conformance requirement.

Also, there is this minor erratum: http://www.whatwg.org/specs/web-apps/current-work/#refsCSS21 is broken (in section 3.2) I haven't done any references yet; I'll probably get to them in a couple of months.

Giovanni
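To illustrate the constraint Giovanni refers to: an XML ID must match the Name production (in particular, it cannot start with a digit), while an HTML5 id only needs to be non-empty and free of spaces. The sketch below uses an ASCII-only approximation of the Name production; the real production admits many more Unicode characters, so this is illustrative, not normative.

```javascript
// ASCII approximation of the XML Name production:
// first char: letter, '_' or ':'; then letters, digits, '.', '_', ':', '-'.
// The full production also allows a wide range of Unicode letters.
function isAsciiXmlName(id) {
  return /^[A-Za-z_:][A-Za-z0-9._:-]*$/.test(id);
}

// isAsciiXmlName('section2')  -> true  (valid in both HTML5 and XML)
// isAsciiXmlName('2section')  -> false (valid HTML5 id, invalid XML ID)
```

This is exactly the class of ids (e.g. purely numeric ones) that would break on a transition from text/html to XHTML.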
Re: [whatwg] on bibtex-in-html5
| 1. BibTeX is designed for the sciences, which typically only cite
| secondary academic literature. It is thus inadequate for, and not widely
| used in, many fields outside of the sciences: the humanities and law
| being quite obvious examples. For this reason, BibTeX cannot by
| default adequately represent even the use cases Ian has identified.
| For example, there are many citations on Wikipedia that can only be
| represented using effectively useless types such as misc and which
| require new properties to be invented.

We will probably have to increase the coverage in due course, yes. However, we should verify that the mechanism works in principle before investing the time to extend the vocabulary.

No; you should drop this proposal and move it to an experimental annex. If you do insist, against all reason, on pushing forward with this without modification, then I suggest you explain how this process of extension will work. If, as I suspect, it'll be another case of a centralized authority (you, who have admitted you really know nothing about this space), then that's a deal-breaker from my perspective.

Related to this, I want to remark on some things at a more general level: we are currently experiencing major changes in the world of bibliographic software. At least, this is how I experience it. After years of limited and/or closed formats and models like BibTeX or Endnote, we finally see new models like CSL or biblatex emerging which try to learn the lessons of the past. Of course, I do not know how things will evolve, but looking at the success of solutions like Zotero, I think it's not so bold to say that things will change quite a bit in the coming years.

And then we have HTML5, an emerging standard which is now getting support from the newest and latest browsers. I know even less how HTML5 will evolve, or what impact it will have on the web. But it's probably fair to say that widespread adoption of HTML5 will not happen overnight.
Honestly, I really don't get why a coming web standard should support a bibliographic standard which is obviously outdated. The fact that BibTeX is widely used is really a non-argument, because if we follow this logic we won't have any development. By the same logic you should avoid something like video; after all, there isn't any support for it *yet*. If HTML5 wants to be forward-looking, it certainly shouldn't adopt a twenty-year-old standard but should instead try to support something new which is really up to date and has a chance of being useful in the future.

simon
Re: [whatwg] Helping people searching for content filtered by license
On Wed, Jun 10, 2009 at 3:46 AM, Eduard Pascual herenva...@gmail.com wrote: On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote: [...] This has some implications: - Each unit of content (recipe in this case) must have its own independent page at a distinct URL. This is actually good practice anyway today for making content discoverable from search engines, and it is compatible with what people already do, so this seems fine.

This is, in a wide range of cases, entirely impossible: while it might work, and maybe it's even good practice, for content that can be represented on the web as an HTML document, it is not achievable for many other formats. Here are some obvious cases: Pictures (and other media) used on a page: An author might want to have protected content, but to allow re-use of some media under certain licenses. A good example of this are online media libraries, which have a good deal of media available for reuse but obviously protect the resources that inherently belong to the site (such as the site's own logo and design elements). Having a separate page to describe each resource's licensing is not easily achievable, and may be completely out of reach for small sites that handle all their content by hand (most prominently, designers' portfolio sites that offer many of their contents under some attribution license to promote their work).

Even on small sites, though, if they have a picture gallery they almost certainly have the ability to view each picture individually as well, usually by clicking on the picture itself. That's the page you'd put the license information on. I think it's fundamentally rare to have a bunch of resources that (a) *only* exist grouped together on a single page, and (b) need different licenses.

Software: I have stated this previously, but here it goes again: just like with media, it's impossible to simply put a <link rel="license" ...> on an MSI package or a tarball.
Sure, the package itself will normally include a file with the text of the corresponding license(s), but this doesn't help in making the licensing discoverable by search engines and other forms of web crawlers. It looks like I should make a page for each of the products (or even each of the releases), so I can put the link tag there and everybody's happy... actually, this makes so much sense that I already have such pages for each of my releases (even if there aren't many as of now); but I *can't* put the link on them, because my software is under more liberal licenses (mostly GPL) than other elements of the page (such as the site's logo, appearing everywhere on the page, which is CC-BY-NC-ND), and I obviously don't want such contents to appear on searches for images that I can modify and use commercially, for example. As Ian stated, <link rel=license> does *not* mean "This entire page is covered under the linked license", but rather "The primary content of this page is covered under the linked license". This is different from preliminary definitions of rel=license, but it's how it is overwhelmingly used in practice, and so HTML5 redefined it to match. So, since you already create separate pages for each release, you're completely fine. ^_^ Until now, the best way to approach this need I have seen would be RDF's triple concept: instead of saying "licensed under Y", I'm trying to say "X is licensed under Y", and maybe also "and X2 is licensed under Y2", and this is inherently a triple. I am, however, open to alternatives (at least on this aspect), as long as they provide any benefit other than mere validation (which I don't even care about anymore, btw) over currently deployed and available solutions.
I am not sure whether Microdata can handle this case or not (after all, it is capable of expressing some RDF triples), but the fact is that I can make my content discoverable by Google and Yahoo using ccREL (quite suboptimal, and wouldn't validate on HTML5, but would still work), but I can't do so using Microdata (which is also suboptimal, would validate on HTML5, but doesn't work anywhere yet). Of course microdata can handle it. Assuming a theoretical Microdata vocab for Creative Commons, you can do it with: <div item> <div itemprop=cc.work> foo... </div> <a itemprop=cc.license href="http://creativecommons.org/license/cc-gpl">This work is licensed under the GNU GPL, version 3 or later</a> </div> (You can also separate the license markup from your work by slapping an id on your work and using @subject on the license link.) Remember, Microdata and RDF are essentially identical in nearly all realistic cases, with only a few small differences - namely that Microdata forms a tree structure rather than a more general graph. That's rarely relevant, however, and nearly all common metadata annotations can be done just fine as a tree. Though, of course, as long as your work was the primary content of the page, you can skip Microdata entirely and
Re: [whatwg] [html5] r3218 - [] (0) Mention frameset event handler attributes (they work like body's apparently)
On Wed, 10 Jun 2009 10:31:54 +0200, wha...@whatwg.org wrote: Author: ianh Date: 2009-06-10 01:31:52 -0700 (Wed, 10 Jun 2009) New Revision: 3218 Modified: index source Log: [] (0) Mention frameset event handler attributes (they work like body's apparently) + <p>In addition, <code>frameset</code> elements must implement the + following interface:</p> + + <pre class="idl">interface <dfn>HTMLFramesetElement</dfn> : <span>HTMLElement</span> { Should be HTMLFrameSetElement. rows and cols should probably be in the interface, too. While you're at it, you could specify HTMLFrameElement. Maybe there are other interfaces or members that are currently lacking. -- Simon Pieters Opera Software
Re: [whatwg] Helping people searching for content filtered by license
On Wed, Jun 10, 2009 at 9:19 AM, Tab Atkins Jr.jackalm...@gmail.com wrote: On Wed, Jun 10, 2009 at 3:46 AM, Eduard Pascualherenva...@gmail.com wrote: On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote: [...] This has some implications: - Each unit of content (recipe in this case) must have its own independent page at a distinct URL. This is actually good practice anyway today for making content discoverable from search engines, and it is compatible with what people already do, so this seems fine. This is, in a wide range of cases, entirely impossible: while it might work, and maybe it's even good practice, for contents that can be represented on the web as an HTML document, it is not achievable for many other formats. Here are some obvious cases: Pictures (and other media) used on a page: An author might want to have protected content, but to allow re-use of some media under certain licenses. A good example of this are online media libraries, which have a good deal of media available for reuse but obviously protect the resources that inherently belong to the site (such as the site's own logo and design elements): Having a separate page to describe each resource's licensing is not easily achievable, and may be completely out of reach for small sites that handle all their content by hand (most prominently, designer's portfolio sites that offer many of their contents under some attribution license to promote their work). Even on small sites, though, if they have a picture gallery they almost certainly have the ability to view each picture individually as well, usually by clicking on the picture itself. That's the page you'd put the license information on. What about the case where you have a JS-based viewer, and so when the user clicks a photo, they do not go to a separate page, but instead get a pop-up viewer? Surely that's common, and it's entirely feasible that different photos on the page would have different licenses.
Or another case: a weblog that includes third-party photo content (could be your own photos too). You want to label your blog text with one license, and the linked photos with another. ... Bruce
Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
- Original Message From: Ian Hickson i...@hixie.ch To: Brett Zamir bret...@yahoo.com Cc: wha...@whatwg.org Sent: Wednesday, June 10, 2009 11:48:09 AM Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization? On Mon, 18 May 2009, Brett Zamir wrote: Has any thought been given to standardizing on at least a part of DOM Level 3 Load and Save in HTML5? DOM3 Load and Save is already standardised as far as I can tell. I don't see why HTML5 would have to say anything about it. The hope was that there would be some added impetus to have browsers settle on a standard way of doing this, since, to my knowledge, only Opera has implemented DOM Level 3 LS (Mozilla for one hasn't seemed keen on implementing it), and I'm afraid this otherwise very important functionality will remain unimplemented or unstandardized across browsers. DOMParser() and XMLSerializer() may be available in more than just Mozilla, but are not standardized, and innerHTML, along with the other issues Boris mentioned in the DOMParser / XMLSerializer thread (e.g., being able to parse by content type like XML), just doesn't sound appropriate to handle a plain XML document when it's called innerHTML. thanks, Brett
Re: [whatwg] Feedback
On Tue, Jun 9, 2009 at 10:59 PM, Mike Weissenbornmike_weissenb...@yahoo.com wrote: 1) I've used frames in many web pages, and I see this is being dropped. I typically have a selection frame and a result frame. So links clicked on in the 1st frame show up in the second frame. I then never have to worry about managing what's in each frame. For many pages I can likely use a block element like a DIV, but my ISP has size limitations and I have spread my pages onto several sites. I have no problems switching to something else but I didn't see anything in the specs except opening a new window to accomplish this. If something else is being used, how will this be compatible with older browsers? In general, frames are bad. They break bookmarking, are horrible for accessibility, and don't do anything that can't be accomplished relatively easily in better ways. For your particular use-case, it appears you have a really weird ISP. I'd suggest leaving them and getting a competent one. ^_^ I can suggest some privately if you'd like. Your current setup will be next to impossible to do properly. It seems like what you're doing currently is putting common 'site' navigation in one frame, and page contents in another. Generally the way this is done without frames is to use some server-side language (PHP, etc.) to 'build' your pages, combining one or two 'template' files with a 'content' file so that it makes a full page. That way you can modify the template files once and have the change reflected across the entire site automatically. This is usually rather trivial to implement, and in the end you have a page with none of the problems that frames do. 2) I am perhaps one of the few I know to use XForms and I am excited about being able to have like capabilities in all browsers. The implementation image I saw looked somewhat different and didn't really describe what's new, changed, or obsolete. Personally I want the same capabilities of XForms; being able to save locally, FTP.
or URL and this wasn't really identified. I don't mind having to make changes, I just want it to work. Still on XForms, I would like additional functionality, which I think you may have dealt with: being able to reformat/reorder the data via CSS or a datagrid to a format the user wishes the data to be viewed in. Obviously this may be defined via code, but I'm hoping the WebForms implementation will allow for things such as sortable columns, re-orderable columns, hide/show columns... I don't know if the subject of data binding has ever come up. I like the data binding in IE, however other browsers don't support this ability and I have to use binding in IE and XForms for Firefox. I would really benefit from being able to use the same code for both. I did notice a Local Storage component, which I hope some consistent client call can be done to Post or Sync these to a URL... Don't know much about XForms, and haven't had cause to look into them, so can't help you here. 3) XForms or not, I hope anything displayable can be formatted appropriately using CSS. There seem to be many browser-specific formatting settings; is there a way to consolidate these with this release to eliminate or reduce browser-specific CSS settings? There are very good reasons why no browser allows you full control over form element styling; namely, security. A few elements (mainly input type=file) must be handled very carefully to make sure they can't be abused, and allowing arbitrary styling pushes the door wide open. Regardless, though, this is a CSS issue, not an HTML issue, and so should be on the CSS mailing list. 4) Not being able to implement #3, somehow within CSS it would then be nice to have some type of IF statement so additional CSS can be included or excluded for non-compliant browsers... Even down the road, the ability to include/exclude imports based on browser capabilities could benefit many. Unless defined, browser builders will continue to build their own settings.
I'm sure this is out of your control, but perhaps an IF isn't. I hate the idea of having to create a different presentation based on the browser, but how does one ever ensure someone's browser is compatible or the content is displayed appropriately? Again, CSS issue, not HTML. 5) On the CSS, I'm sure builders/browser developers would love an XML format. If there are no CSS format changes perhaps this can be identified as a future enhancement/direction. CSS seems to be a real oddball format compared to everything else. Been discussed (though possibly as an April Fools?). Doesn't seem to be any real reason to do this, other than that some people already have XML generators lying around and would like to use them. And once more, CSS issue, not HTML. ^_^ 6) I did see some comment about user defined variables in the FAQ. I see no reason why, if I embed something called MIKE in an html file and the CSS attributes
Re: [whatwg] Helping people searching for content filtered by license
Jeff Walden wrote: ... Maybe I'm the only person who thinks it (I'd like to hope I'm merely the only person to say it, unless I've missed its mention in the past), but this feels like mission creep to me. ... You're not the only person. BR, Julian
Re: [whatwg] Helping people searching for content filtered by license
Leif Halvard Silli wrote: ... and there are a number of folks who disagree (not just us in RDFa), including at least two RECs (RDFa and GRDDL). Is this claim based on a mere comparison of the description of those link relations in said specifications? Perhaps some of the disagreements are merely a different wording? ... As a matter of fact I don't see RDFa using @profile. The point is: if you assume that @rel=foo always means the same thing, then many folks believe you're already violating the HTML spec, which specifically uses @profile to modulate the meaning of @rel, and sometimes via another level of indirection. Where does the Nottingham draft define anything that contradicts the default HTML 4.01 profile? Authors will often assume that rel=foo means the same thing wherever it appears, hence a central register is a benefit so that specification writers and profile writers can know what the standard semantics are. The Web Linking draft does not override anything in HTML 4.01. It just states that generic link relations are a good idea, creates an IANA registry for them, and defines how to use them in the HTTP Link header. That being said, I *do* believe that it's an incredibly bad idea to use the same relation name for different things. ... BR, Julian
Re: [whatwg] Helping people searching for content filtered by license
On Wed, Jun 10, 2009 at 10:05 AM, Tab Atkins Jr.jackalm...@gmail.com wrote: ... What about the case where you have a JS-based viewer, and so when the user clicks a photo, they do not go to a separate page, but instead get a pop-up viewer? That is indeed a valid concern. The obvious way to address it is to have a permalink on the JS popup, which will send you to an individual page with that content where the license info is located. In this scenario the JS viewer is merely a convenience, allowing you to view the pictures without a page refresh, rather than a necessity. Hopefully that's true anyway, for accessibility reasons! Thus you get the best of both worlds - machine-readable data on the individual pages, and you can still put human-readable license info on the JS popup. But why can't one have the best of both worlds without having to go to separate pages for each photo? Surely that's common, and it's entirely feasible that different photos on the page would have different licenses. I don't think it's that common for different photos on the page to have different licenses (and preventing that scenario is just one more reason to fight license proliferation), but even if true it's covered by the above. Depends what you mean by covered. I'd say the RDFa examples of this cover it better in the sense that they don't impose an arbitrary restriction that the license only applies to a single object (or I suppose group of objects). Or another case: a weblog that includes third-party photo content (could be your own photos too). You want to label your blog text with one license, and the linked photos with another. This is indeed not covered by @rel=license. Is it necessary to embed the separate licensing information for the pictures in a machine-readable way? It seems that just putting a human-readable license link on each picture would work pretty well. 
This isn't really my area, but I could imagine an organization (in particular) wanting to include machine-readable license links (a la CC). Bruce
Re: [whatwg] Helping people searching for content filtered by license
On Wed, Jun 10, 2009 at 9:37 AM, Bruce D'Arcusbdar...@gmail.com wrote: On Wed, Jun 10, 2009 at 10:05 AM, Tab Atkins Jr.jackalm...@gmail.com wrote: What about the case where you have a JS-based viewer, and so when the user clicks a photo, they do not go to a separate page, but instead get a pop-up viewer? That is indeed a valid concern. The obvious way to address it is to have a permalink on the JS popup, which will send you to an individual page with that content where the license info is located. In this scenario the JS viewer is merely a convenience, allowing you to view the pictures without a page refresh, rather than a necessity. Hopefully that's true anyway, for accessibility reasons! Thus you get the best of both worlds - machine-readable data on the individual pages, and you can still put human-readable license info on the JS popup. But why can't one have the best of both worlds without having to go to separate pages for each photo? Hopefully you have a separate page for each photo *anyway*. If you don't - that is, if you only have a thumbnails page, and then a js-based fullsize viewer - your page is pretty crappy in terms of accessibility and discoverability. Given that of course we all value making our pages accessible ^_^, the problem is already solved. The js-based viewer is merely a convenience for those that can use it, and license information can be embedded on the individual pages. Surely that's common, and it's entirely feasible that different photos on the page would have different licenses. I don't think it's that common for different photos on the page to have different licenses (and preventing that scenario is just one more reason to fight license proliferation), but even if true it's covered by the above. Depends what you mean by covered. I'd say the RDFa examples of this cover it better in the sense that they don't impose an arbitrary restriction that the license only applies to a single object (or I suppose group of objects). 
The restriction is far from arbitrary - it makes it dead-simple. Any solution that allows you to assign different licenses to various pieces of content on a single page in a machine-readable way is necessarily more complex. It's not apparent in these examples that anything more complex is necessary. Regardless, though, the situation is *indeed* covered. The fact that you can imagine a slightly different solution doesn't change the fact that existing markup is *a* solution, at least for any halfway decent site design. Or another case: a weblog that includes third-party photo content (could be your own photos too). You want to label your blog text with one license, and the linked photos with another. This is indeed not covered by @rel=license. Is it necessary to embed the separate licensing information for the pictures in a machine-readable way? It seems that just putting a human-readable license link on each picture would work pretty well. This isn't really my area, but I could imagine an organization (in particular) wanting to include machine-readable license links (a la CC). Can you illustrate this more plainly? ~TJ
Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
On Wed, 10 Jun 2009 09:49:08 -0400, Brett Zamir bret...@yahoo.com wrote: - Original Message From: Ian Hickson i...@hixie.ch To: Brett Zamir bret...@yahoo.com Cc: wha...@whatwg.org Sent: Wednesday, June 10, 2009 11:48:09 AM Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization? On Mon, 18 May 2009, Brett Zamir wrote: Has any thought been given to standardizing on at least a part of DOM Level 3 Load and Save in HTML5? DOM3 Load and Save is already standardised as far as I can tell. I don't see why HTML5 would have to say anything about it. The hope was that there would be some added impetus to have browsers settle on a standard way of doing this, since to my knowledge, it looks to me like only Opera has implemented DOM Level 3 LS Opera's implementation is buggy. The async version never fires a load event, handling of errors is all messed up and some functions don't work. It's pretty much useless except for synchronous loading in perfect conditions. It seems that everyone wants DOM3 LS to die and to have everyone use JS to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to do what DOM3 LS does. -- Michael
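The "JS wrapper" approach Michael describes has to start with feature detection, since DOMParser and DOM3 LS don't coexist everywhere. A minimal sketch of that detection step is below; `pickXmlParser` is a hypothetical helper name (not any library's API), and it only inspects a window-like object so the logic is testable outside a browser:

```javascript
// Hypothetical helper for a wrapper around the competing XML parsing
// APIs: inspect a window-like object and report which API is usable,
// preferring the de-facto DOMParser over DOM Level 3 Load and Save.
function pickXmlParser(win) {
  if (typeof win.DOMParser === 'function') {
    return 'DOMParser';               // Mozilla-originated, widely cloned
  }
  var impl = win.document && win.document.implementation;
  if (impl && typeof impl.createLSParser === 'function') {
    return 'DOM3-LS';                 // e.g. Opera's implementation
  }
  return null;                        // fall back to XHR's responseXML, etc.
}
```

A real wrapper would then route `parseFromString` or `createLSParser`/`parse` calls behind one common interface, which is exactly the per-site boilerplate the thread is complaining about.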
Re: [whatwg] Expose event.dataTransfer.files accessor to allow filedrag and drop
Microsoft has recently invented and deployed a custom ActiveX component to drop local files onto Live Spaces. This component is undocumented and it is probably limited to the Spaces service. Chris
Re: [whatwg] Helping people searching for content filtered by license
A JavaScript-based viewer for images can overlay an image within an IFRAME and the IFRAME may contain the license link. HTH, Chris
Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
On Wed, 10 Jun 2009 17:13:28 +0200, Michael A. Puls II shadow2...@gmail.com wrote: Opera's implementation is buggy. The async version never fires a load event, handling of errors is all messed up and some functions don't work. It's pretty much useless except for synchronous loading in perfect conditions. We should probably nuke it. It seems that everyone wants DOM3 LS to die and to have everyone use JS to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to do what DOM3 LS does. Yeah, no need for two high-level network APIs. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Limit on number of parallel Workers.
That's a great approach. Is the pool of OS threads per-domain, or per browser instance (i.e. can a domain DoS the workers of other domains by firing off several infinite-loop workers)? Seems like having a per-domain thread pool is an ideal solution to this problem. -atw On Tue, Jun 9, 2009 at 9:33 PM, Dmitry Titov dim...@chromium.org wrote: On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordman micha...@google.com wrote: This is the solution that Firefox 3.5 uses. We use a pool of relatively few OS threads (5 or so iirc). This pool is then scheduled to run worker tasks as they are scheduled. So for example if you create 1000 worker objects, those 5 threads will take turns to execute the initial scripts one at a time. If you then send a message using postMessage to 500 of those workers, and the other 500 call setTimeout in their initial script, the same threads will take turns to run those 1000 tasks (500 message events, and 500 timer callbacks). This is somewhat simplified, and things are a little more complicated due to how we handle synchronous network loads (during which we freeze an OS thread and remove it from the pool), but the above is the basic idea. / Jonas That's a really good model. Scalable and degrades nicely. The only problem is with very long running operations where a worker script doesn't return in a timely fashion. If enough of them do that, all others starve. What does FF do about that, or in practice do you anticipate that not being an issue? Webkit dedicates an OS thread per worker. Chrome goes even further (for now at least) with a process per worker. The 1:1 mapping is probably overkill as most workers will probably spend most of their life asleep just waiting for a message. Indeed, it seems FF has a pretty good solution for this (at least for the non-multiprocess case). 1:1 does not scale well in the case of threads, and especially in the case of processes.
Here http://figushki.com/test/workers/workers.html is a page that can create a variable number of workers to observe the effects; the curious can run it in FF3.5, in Safari 4, or in Chromium with the '--enable-web-workers' flag. Don't click the 'add 1000' button in Safari 4 or Chromium if you are not prepared to kill the unresponsive browser while the whole system gets half-frozen. FF continues to work just fine, well done guys :-) Dmitry
Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
From: Anne van Kesteren ann...@opera.com To: Michael A. Puls II shadow2...@gmail.com; Brett Zamir bret...@yahoo.com; Ian Hickson i...@hixie.ch Cc: wha...@whatwg.org Sent: Thursday, June 11, 2009 12:31:10 AM Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization? On Wed, 10 Jun 2009 17:13:28 +0200, Michael A. Puls II shadow2...@gmail.com wrote: It seems that everyone wants DOM3 LS to die and to have everyone use JS to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to do what DOM3 LS does. Yeah, no need for two high-level network APIs. That'd be fine by me if at least DOMParser + XMLSerializer was being officially standardized on... Brett
Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
On Wed, 10 Jun 2009 20:36:30 +0200, Brett Zamir bret...@yahoo.com wrote: That'd be fine by me if at least DOMParser + XMLSerializer was being officially standardized on... See the separate thread on those objects. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Limit on number of parallel Workers.
On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordmanmicha...@google.com wrote: This is the solution that Firefox 3.5 uses. We use a pool of relatively few OS threads (5 or so iirc). This pool is then scheduled to run worker tasks as they are scheduled. So for example if you create 1000 worker objects, those 5 threads will take turns to execute the initial scripts one at a time. If you then send a message using postMessage to 500 of those workers, and the other 500 call setTimeout in their initial script, the same threads will take turns to run those 1000 tasks (500 message events, and 500 timer callbacks). This is somewhat simplified, and things are a little more complicated due to how we handle synchronous network loads (during which we freeze an OS thread and remove it from the pool), but the above is the basic idea. / Jonas That's a really good model. Scalable and degrades nicely. The only problem is with very long running operations where a worker script doesn't return in a timely fashion. If enough of them do that, all others starve. What does FF do about that, or in practice do you anticipate that not being an issue? Webkit dedicates an OS thread per worker. Chrome goes even further (for now at least) with a process per worker. The 1:1 mapping is probably overkill as most workers will probably spend most of their life asleep just waiting for a message. We do see it as a problem, but not big enough of a problem that we needed to solve it in the initial version. It's not really a problem for most types of calculations; as long as the number of threads is larger than the number of cores, we'll still finish all tasks as quickly as the CPU is able to. Even for long running operations, if it's operations that the user wants anyway, it doesn't really matter if the jobs are running all in parallel, or staggered after each other. As long as you're keeping all CPU cores busy. There are some scenarios which it doesn't work so well for.
For example, a worker that works more or less infinitely and produces more and more accurate results the longer it runs. Or something like a fold...@home website which performs calculations as long as the user is on a website and submits them to the server. If enough of those workers are scheduled it will block everything else. This is all solvable of course, there's a lot of tweaking we can do. But we figured we wanted to get some data on how people use workers before spending too much time developing a perfect scheduling solution. / Jonas
Re: [whatwg] Limit on number of parallel Workers.
On Wed, Jun 10, 2009 at 1:46 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordmanmicha...@google.com wrote: This is the solution that Firefox 3.5 uses. We use a pool of relatively few OS threads (5 or so iirc). This pool is then scheduled to run worker tasks as they are scheduled. So for example if you create 1000 worker objects, those 5 threads will take turns to execute the initial scripts one at a time. If you then send a message using postMessage to 500 of those workers, and the other 500 call setTimeout in their initial script, the same threads will take turns to run those 1000 tasks (500 message events, and 500 timer callbacks). This is somewhat simplified, and things are a little more complicated due to how we handle synchronous network loads (during which we freeze an OS thread and remove it from the pool), but the above is the basic idea. / Jonas That's a really good model. Scalable and degrades nicely. The only problem is with very long running operations where a worker script doesn't return in a timely fashion. If enough of them do that, all others starve. What does FF do about that, or in practice do you anticipate that not being an issue? Webkit dedicates an OS thread per worker. Chrome goes even further (for now at least) with a process per worker. The 1:1 mapping is probably overkill as most workers will probably spend most of their life asleep just waiting for a message. We do see it as a problem, but not big enough of a problem that we needed to solve it in the initial version. It's not really a problem for most types of calculations; as long as the number of threads is larger than the number of cores, we'll still finish all tasks as quickly as the CPU is able to. Even for long running operations, if it's operations that the user wants anyway, it doesn't really matter if the jobs are running all in parallel, or staggered after each other. As long as you're keeping all CPU cores busy.
There are some scenarios which it doesn't work so well for. For example, a worker that works more or less infinitely and produces more and more accurate results the longer it runs. Or something like a fold...@home website which performs calculations as long as the user is on a website and submits them to the server. If enough of those workers are scheduled it will block everything else. This is all solvable of course, there's a lot of tweaking we can do. But we figured we wanted to get some data on how people use workers before spending too much time developing a perfect scheduling solution. I never did like the Gears model (1:1 mapping with a thread). We were stuck with a strong thread affinity due to other constraints (script engines, COM/XPCOM). But we could have allowed multiple workers to reside in a single thread. A thread pool (perhaps per origin) sort of arrangement, where once a worker was put on a particular thread it stayed there until end-of-life. Your FF model has more flexibility. Give a worker a slice (well where slice == run-to-completion) on any thread in the pool, no thread affinity whatsoever (if I understand correctly). / Jonas
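The Firefox model described above (a small fixed pool, each task run to completion on whichever pool slot is free next) can be sketched as a toy, single-threaded simulation. `runPool` and its task shape are illustrative names only, not any browser's actual code, and the round-robin slot choice is a simplification of real scheduling:

```javascript
// Toy simulation of a worker-task pool: `poolSize` "threads" take
// queued tasks in order, and each task runs to completion before its
// slot picks up the next one. A task may enqueue follow-up tasks
// (like a postMessage event or a setTimeout callback would).
function runPool(tasks, poolSize) {
  var queue = tasks.slice();
  var log = [];
  for (var turn = 0; queue.length > 0; turn++) {
    var slot = turn % poolSize;   // which "thread" picks up this task
    var task = queue.shift();
    log.push('thread' + slot + ':' + task.name);
    task.run(queue);              // run to completion; may push new tasks
  }
  return log;
}
```

The starvation concern in the thread falls out of this model directly: if `task.run` never returns, its slot never takes another task, and with enough such tasks the whole pool stalls.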
Re: [whatwg] Frame advance feature for a paused VIDEO
On Thu, 21 May 2009, Biju wrote: I don't see a way to do a frame advance feature for a paused VIDEO. Is there a way to achieve that? As well as frame backward also. There is no way to do this today, but I imagine we'll add an API for this in due course. It's the first thing on the list of features for the next version, in fact. On Mon, 25 May 2009, Philip Jägenstedt wrote: If you pause the video and set currentTime it should advance to that frame. As long as you know the frame rate you're good to go. All in theory of course, implementations may not be all the way there yet. On Tue, 26 May 2009, Robert O'Callahan wrote: I don't think there is a standard way to expose the frame rate. We might even want something more general than the frame rate, since conceivably you could have a video format where the interval between frames is variable. On Tue, 26 May 2009, Robert O'Callahan wrote: It's more than conceivable, actually --- chained Oggs can have the frame rate varying between segments. So if you're at the last frame of one segment the time till the next frame can be different from the time since the previous frame. On Tue, 26 May 2009, Philip Jägenstedt wrote: Indeed, I don't suggest adding an API for exposing the frame rate, I'm just saying that if you know the frame rate by some external means then you can just set currentTime. On Tue, 26 May 2009, Robert O'Callahan wrote: OK, sure. Since there are lots of situations where you don't know the frame rate via external means, it seems new API is needed here. On Mon, 25 May 2009, Jonas Sicking wrote: There doesn't seem to be a way to do so. Definitely something I think we should consider for the next version of the API. I agree with the above comments. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
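Philip's workaround (pause, then nudge currentTime by one frame duration) can be sketched as below. `frameStepTime` is a hypothetical helper, and it assumes a constant frame rate known by external means, which is exactly the assumption the rest of the thread pokes at:

```javascript
// Hypothetical frame-step helper built on the existing media API:
// given the current playback position and a known, constant frame
// rate, compute the time of the adjacent frame.
function frameStepTime(currentTime, fps, direction) {
  // direction: +1 for frame advance, -1 for frame backward
  var next = currentTime + direction / fps;
  return next < 0 ? 0 : next;   // clamp at the start of the media
}

// In a browser one would apply it to a paused <video> element, e.g.:
//   video.pause();
//   video.currentTime = frameStepTime(video.currentTime, 25, +1);
```

For variable-frame-duration formats like chained Ogg this computation is wrong at segment boundaries, which is why the thread concludes a real frame-stepping API is needed.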
Re: [whatwg] Exposing known data types in a reusable way
On Thu, 21 May 2009, Eduard Pascual wrote: Within 5.4.1 vCard, by the end of the n property description, the spec reads: The value of the fn property a name in one of the following forms: shouldn't it read: The value of the fn property is a name in one of the following forms: ? Maybe this will grant me a seat for posterity on the acknowledgements section =P. Indeed, thanks!
Re: [whatwg] Naming of Self-closing start tag state
On Thu, 21 May 2009, Geoffrey Sneddon wrote: I think this is a bit of a misnomer, as the current token can be an end tag token (although it will throw a parse error whatever happens once it reaches this state). I suggest renaming it to self-closing tag state. I started doing this, but then I stopped, because while the whole name of the state is wrong, changing it at this point would just confuse people who have implementations, and that doesn't seem worth it. If you need to justify to yourself why the Self-closing start tag state can be reached for end tags, just consider that that is why it's a parse error -- it's obviously wrong syntax if it mixes start tag and end tag syntax. If you need to justify to yourself why the state is called self-closing when the syntax in fact has no effect whatsoever, least of all actually closing anything, then you haven't got enough problems, and I recommend volunteering for some community service or something.
Re: [whatwg] Limit on number of parallel Workers.
The current thinking would be a smaller limit per page (i.e. includes all iframes and external scripts), say around 16 workers. Then a global limit for all loaded pages, say around 64 or 128. The benefit of two limits is to reduce the chance of pages behaving differently depending on what other sites are currently loaded. We plan on increasing these limits by a fair amount once we are able to run multiple JS threads in a process. It's just that even when we do that, we'll still want to have some limits, and we wanted to use the same approach now. On Wed, Jun 10, 2009 at 2:56 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Thu, Jun 11, 2009 at 5:24 AM, Drew Wilson atwil...@google.com wrote: That's a great approach. Is the pool of OS threads per-domain, or per browser instance (i.e. can a domain DoS the workers of other domains by firing off several infinite-loop workers)? Seems like having a per-domain thread pool is an ideal solution to this problem. You probably still want a global limit, or else malicious sites can DoS your entire OS by spawning workers in many synthetic domains. Making the limit per-eTLD instead of per-domain would help a bit, but maybe not very much. Same goes for other kinds of resources; there's no really perfect solution to DoS attacks against browsers, AFAICT. Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
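The two-tier limit described at the top of this thread amounts to a simple admission check. A minimal sketch, using the example numbers from the thread (16 per page, 64 global); the constant and function names are assumptions for illustration, not anything from a browser's source:

```javascript
// Two-tier worker admission policy: a per-page cap stops one page
// (including its iframes and external scripts) from monopolizing the
// pool, and a global cap bounds total resource use across all pages.
const PER_PAGE_LIMIT = 16; // per page, counting iframes/external scripts
const GLOBAL_LIMIT = 64;   // across all loaded pages

function canStartWorker(pageWorkerCount, globalWorkerCount) {
  return pageWorkerCount < PER_PAGE_LIMIT &&
         globalWorkerCount < GLOBAL_LIMIT;
}
```

Note how the global cap is what Robert O'Callahan's point is about: even with per-domain (or per-eTLD) pools, without a global limit a malicious site could exhaust the OS by spawning workers from many synthetic domains.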
Re: [whatwg] Removing the need for separate feeds
On Fri, 22 May 2009, Dan Brickley wrote: On 22/5/09 09:21, Ian Hickson wrote: On Fri, 22 May 2009, Henri Sivonen wrote: On May 22, 2009, at 09:01, Ian Hickson wrote: USE CASE: Remove the need for feeds to restate the content of HTML pages (i.e. replace Atom with HTML). Did you do some kind of Is this Good for the Web? analysis on this one? That is, do things get better if there's yet another feed format? As far as I can tell, things get better if the feed format and the default output format are the same, yes. Generally, redundant information has tended to lead to problems. Would this include having a mechanism (microdata? xml islands?) that preserves extension markup from Atom feeds? eg. see http://www.ibm.com/developerworks/xml/library/x-extatom1/ Actually the algorithm to convert HTML to Atom doesn't even support all of Atom, let alone extensions. However, it's quite possible to extend HTML itself if it is to be used as a native feed format, as described here: http://wiki.whatwg.org/wiki/FAQ#HTML5_should_support_a_way_for_anyone_to_invent_new_elements.21 On Fri, 22 May 2009, Adrian Sutton wrote: On 22/05/2009 08:21, Ian Hickson i...@hixie.ch wrote: As far as I can tell, things get better if the feed format and the default output format are the same, yes. Generally, redundant information has tended to lead to problems. Can you point to examples of this in relation to the use of feeds in particular? Smylers listed more than I could think of: On Fri, 22 May 2009, Smylers wrote: I can't find examples right now, but I have encountered various problems along these lines in the past, including: * The feed suddenly becomes empty. * A new blog has a 'feed' link, but it never works. * A blog's feed URL changes, but doesn't redirect. * A feed is misformatted in a way which causes it to be ignored. 
* The content of a feed is misformatted, such that in a feed reader its display is mangled, such as HTML tags and entities showing, or spaces having been squeezed out from around tags such that linked words don't have spaces around them. * The content of a feed has certain critical information, such as an image, stripped from it, such that it makes no sense, or has a different meaning from the full post. * The content of a feed has certain critical mark-up stripped from it, such as sup around exponents in a mathematical expression rendering 36 where 3 to the power of 6 was intended. In all cases the HTML version of the blog had correctly displaying and updating content; only the feed was affected by the issues. This usually left the author unaware of the problem, as they don't subscribe to their own blog. On Fri, 22 May 2009, Adrian Sutton wrote: This feels a lot like jumping the shark and solving a problem that has already been solved at one end (syndicating content) and doesn't exist at the other (syndicated content being out of sync with the HTML version). It seems like defining how one converts HTML to Atom is useful in general even if -- maybe even especially if -- the desire is to use Atom. On Fri, 22 May 2009, Eduard Pascual wrote: While redundant *source* information easily leads to problems, from what I have seen the sites using feeds tend to be almost always dynamic: both the HTML pages and the feeds are generated via server scripts from the *same set of source data*, normally from a database. This is especially true for blogs, and any other CMS-based site, since CMSs normally rely a lot on databases and server-side scripting. So in these cases we don't actually have redundant information, but just multiple ways to retrieve the same information. That seems plausible, yes. For manually authored pages and feeds things would be different; but are there really a significant number of such cases out there?
I can't say I have seen the entire web (who can?), but among what I have seen, I have never encountered any hand authored feed, except for code examples and similar experimental stuff. On Fri, 22 May 2009, Toby Inkster wrote: Surely this proves the need for a way of extracting feeds from HTML? I don't know if it proves it per se, but it certainly indicates that there is a possible need. I added the section on how to convert HTML pages to Atom based on requests over the years and most recently specifically in the context of the microdata section. It doesn't replace Atom, nor is anyone required to author HTML in any particular way because of this; it merely provides a migration path if one is desired. I think enabling this kind of interoperability between standards can only be good. On Fri, 22 May 2009, Adrian Sutton wrote: On 22/05/2009 11:36, Toby Inkster m...@tobyinkster.co.uk wrote: You never see manually written feeds because people can't be bothered to manually write feeds. So the people who manually author HTML simply don't