Re: [whatwg] Reserving id attribute values?

2009-06-10 Thread Ian Hickson
On Tue, 19 May 2009, Brett Zamir wrote:

 In order to comply with the ID requirements of XML, and to facilitate 
 future transitions to XML, can HTML 5 explicitly encourage id attribute 
 values to follow this pattern (e.g., disallowing numbers as the 
 starting character)?

Why can't we just change the XML ID requirements in XML to be less strict?


 Also, there is this minor erratum: 
 http://www.whatwg.org/specs/web-apps/current-work/#refsCSS21 is broken 
 (in section 3.2)

I haven't done any references yet; I'll probably get to them in a couple 
of months.

-- 
Ian Hickson               U+1047E                )\._.,--,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Cross-domain databases; was: File package protocol and manifest support?

2009-06-10 Thread Ian Hickson
On Wed, 20 May 2009, Brett Zamir wrote:

 I would like to suggest an incremental though I believe significant 
 enhancement to Offline applications/SQLite.
 
 That is, the ability to share a complete database among offline 
 applications according to the URL from which it was made available.
 [...]

On Tue, 19 May 2009, Drew Wilson wrote:

 One alternate approach to providing this support would be via shared, 
 cross-domain workers (yes, workers are my hammer and now everything 
 looks like a nail :) - this seems like one of the canonical uses of 
 cross-domain workers, in fact.
 
 This would be potentially even more secure than a simple shared 
 database, as it would allow the application to programmatically control 
 access from other domains, synchronize updates, etc while allowing 
 better controls over access (read-only, write via specific exposed write 

On Wed, 20 May 2009, Rob Kroeger wrote:
 
 For what it's worth, this was my immediate thought as well upon reading 
 the idea. The database is insufficiently fast on some platforms to 
 serve as an IPC mechanism, and there are practical limitations with 
 having too many contending transactions, so my instinct would be to build 
 large integrated web apps with a shared worker routing data between 
 components.

On Thu, 28 May 2009, Michael Nordman wrote:
 
 I buy this thinking too as a better strategy for integrating web apps.

Based on the above comments, I haven't added the requested feature at this 
time -- let's see if the existing features can be used to do it first.
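
A rough, same-origin sketch of the shared-worker routing pattern discussed
above, assuming the SharedWorker interface from the Web Workers draft and a
synchronous database API inside workers (openDatabaseSync, the script name,
the schema, and the message format here are all illustrative, not settled
interfaces):

// db-router.js -- a shared worker that serializes access to one database
// on behalf of every page that connects to it.
var db = openDatabaseSync('shared-store', '1.0', 'Shared store', 5*1024*1024);

onconnect = function (event) {
  var port = event.ports[0];
  port.onmessage = function (e) {
    // Expose only a fixed, read-only query: the "programmatic control
    // over access" Drew describes above.
    var rows = [];
    db.readTransaction(function (tx) {
      var result = tx.executeSql(
          'SELECT title, body FROM entries WHERE topic = ?', [e.data.topic]);
      for (var i = 0; i < result.rows.length; i++)
        rows.push(result.rows.item(i));
    });
    port.postMessage({ id: e.data.id, rows: rows });
  };
};

// Page side: route all reads through the worker.
var worker = new SharedWorker('db-router.js');
worker.port.onmessage = function (e) { /* render e.data.rows */ };
worker.port.postMessage({ id: 1, topic: 'databases' });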


On Thu, 28 May 2009, Michael Nordman wrote:
 
 But still, the ability to download a fully formed SQL database, and then 
 run SQL against it would be nice.
 
 openDatabaseFromURL(urlToDatabaseFile);
 
 * downloads the database file if needed (per http cache control headers)
 * the database can reside in an appcache (in which case it would be
 subject to appcache'ing rules instead)
 * returns a read-only database object
 
 Of course, there is the issue of the SQL database format.
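
To make the shape of that proposal concrete, usage might have looked
something like this (openDatabaseFromURL is Michael's hypothetical API,
never specified or implemented; the URL and schema are made up):

// Hypothetical: fetch (or reuse a cached copy of) a fully formed
// database file and open it read-only.
var db = openDatabaseFromURL('http://example.com/data/dictionary.db');
db.readTransaction(function (tx) {
  tx.executeSql('SELECT definition FROM words WHERE word = ?', ['ruby'],
    function (tx, results) {
      alert(results.rows.item(0).definition);
    });
});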

On Thu, 28 May 2009, Anne van Kesteren wrote:
 
 Would there be a lot of overhead in just doing this through XMLHttpRequest,
 some processing, and the database API?

On Thu, 28 May 2009, Michael Nordman wrote:

 Good question. I think you're suggesting...
 * statementsToCreateAndPopulateSQLDatabase  = httpGet();
 * foreach(statement in above) { execute(statement); }
 * now you get to run queries of interest
 
 Certainly going to use more client-side CPU than downloading a fully 
 formed db file. I think the download size would be greater (all of the 
 'INSERT INTO' text overhead), but that's just a guess. A database 
 containing FTS tables would change things a bit too (even less 
 client-side cpu, but more download size).
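
Sketched with the APIs as currently drafted, the XMLHttpRequest route Anne
suggests would look roughly like this (the dump URL, database name, and the
naive statement splitting are all illustrative):

var xhr = new XMLHttpRequest();
xhr.open('GET', '/data/dictionary.sql', true);
xhr.onreadystatechange = function () {
  if (xhr.readyState != 4 || xhr.status != 200)
    return;
  // Naive split -- a real SQL dump would need a proper tokenizer.
  var statements = xhr.responseText.split(';\n');
  var db = openDatabase('dictionary', '1.0', 'Dictionary', 5*1024*1024);
  db.transaction(function (tx) {
    for (var i = 0; i < statements.length; i++)
      tx.executeSql(statements[i]);  // mostly CREATE TABLE / INSERT INTO
  });
};
xhr.send();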

On Fri, 29 May 2009, Anne van Kesteren wrote:
 
 There are certainly drawbacks, but given that we still haven't nailed 
 all the details of the database API proposal discussed by the WebApps WG 
 (e.g. the SQL syntax) and given that it has not been deployed widely, it 
 seems somewhat premature to start introducing convenient APIs around it 
 that introduce a significant amount of complexity themselves. Defining 
 the rules for parsing and creating a raw database file in a secure way 
 is a whole new layer of issues and the gain seems small.

On Fri, 29 May 2009, Michael Nordman wrote:
 
 I don't think this feature's time has come yet either. Just food for 
 thought.

I guess we'll wait on this for now.

-- 
Ian Hickson               U+1047E                )\._.,--,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] HTML5 ruby spec: rp

2009-06-10 Thread Ian Hickson
On Tue, 19 May 2009, Roland Steiner wrote:

 As I am currently in the process of writing an implementation for ruby, I
 was wondering about the constraints put on the content of the rp element
 in the spec:
 If the rp element (http://dev.w3.org/html5/spec/Overview.html#the-rp-element)
 is immediately after an rt element
 (http://dev.w3.org/html5/spec/Overview.html#the-rt-element) that is
 immediately preceded by another rp element: a single character from
 Unicode character class Pe. Otherwise: a single character from Unicode
 character class Ps.
 
 Is there a specific reason that rp is constrained in this way? I imagine
 that someone could want to add additional spaces before/after the
 parenthesis, non-parenthesis separators, or, e.g., in a textbook write:
 
 <ruby><rp>(reading: </rp><rt>Kanji</rt><rp>) </rp></ruby>
 Also note that there isn't such a constraint if one would use CSS rules to
 achieve a similar result (in the absence of proper ruby rendering):
 
 rt:before { content: " (reading: "; }
 rt:after { content: ") "; }

Yeah, I guess this constraint is excessive. I've removed it.

-- 
Ian Hickson               U+1047E                )\._.,--,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

[whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop

2009-06-10 Thread Adam Barth
SUMMARY

Currently the input element exposes selected files via a files
accessor (i.e., as in https://developer.mozilla.org/En/NsIDOMFileList
and http://www.w3.org/TR/file-upload/).  We should add a similar
accessor to event.dataTransfer to enable drag-and-drop of files onto
web pages.

USE CASE

When interacting with a webmail site, users would like to be able to
attach files to email messages by dragging them onto the browser's
content area, as in desktop email clients.  My understanding is that
this is one of the top user requests for Gmail, for example.

WORK AROUNDS

Currently, webmail sites work around this limitation by using the
fugly <input type="file"> control or by using a plug-in, such as
Flash, to handle file uploads.  Other sites, such as Flickr, work
around this limitation by asking users to download an EXE (i.e., the
Flickr uploader) that handles file drag-and-drop.

PROPOSAL

When the user drags-and-drops files onto a web page, we should expose
those files to the page via a files accessor on the dataTransfer
property of the event object.  This feature is consistent with HTML
5's security model for drag and drop.  There are a number of different
API choices, but this appears to be the cleanest and most consistent
with the future of web pages interacting with local files.

Alternative APIs include getData('File'), as defined in
http://msdn.microsoft.com/en-us/library/ms537658(VS.85).aspx.
However, it does not appear that IE ever implemented this API.  (Also,
note that IE doesn't follow HTML 5's clipboard security model.)
Mozilla has an equivalent API in
event.dataTransfer.mozGetDataAt("application/x-moz-file", 0).

Exposing the files attribute is better than these alternatives because
it lets the web page get an object of type File, which can then be
handed off to a future version of XMLHttpRequest, as in
xhr.send(file), without synchronous access to the disk.
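
Under this proposal, a page-side handler might look like the following
sketch (contentArea and the upload URL are illustrative, and xhr.send(file)
assumes the future XMLHttpRequest revision mentioned above):

contentArea.ondragover = function (event) {
  event.preventDefault();  // signal that dropping is allowed
};
contentArea.ondrop = function (event) {
  event.preventDefault();
  var files = event.dataTransfer.files;  // the proposed accessor
  for (var i = 0; i < files.length; i++) {
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/attachments/upload', true);
    xhr.send(files[i]);  // no synchronous disk access needed
  }
};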

IMPLEMENTATION

WebKit has an experimental implementation of this API in
https://bugs.webkit.org/show_bug.cgi?id=25916.

Adam


Re: [whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop

2009-06-10 Thread Anne van Kesteren
On Wed, 10 Jun 2009 10:37:03 +0200, Adam Barth wha...@adambarth.com wrote:
 SUMMARY

 Currently the input element exposes selected files via a files
 accessor (i.e., as in https://developer.mozilla.org/En/NsIDOMFileList
 and http://www.w3.org/TR/file-upload/).  We should add a similar
 accessor to event.dataTransfer to enable drag-and-drop of files onto
 web pages.

This is indeed very cool, but http://www.w3.org/TR/file-upload/ is very 
unstable (and from 2006!) so it seems that would have to settle a bit more 
first. At the very minimum we need a shared understanding of what interfaces 
we want to provide to deal with files.


-- 
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Eduard Pascual
On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote:
 [...]
 This has some implications:

  - Each unit of content (recipe in this case) must have its own
   independent page at a distinct URL. This is actually good practice
   anyway today for making content discoverable from search engines, and
   it is compatible with what people already do, so this seems fine.

This is, in a wide range of cases, entirely impossible: while it might
work, and maybe it's even good practice, for content that can be
represented on the web as an HTML document, it is not achievable for
many other formats. Here are some obvious cases:

Pictures (and other media) used on a page: An author might want to
have protected content, but to allow re-use of some media under
certain licenses. A good example of this is online media libraries,
which have a good deal of media available for reuse but obviously
protect the resources that inherently belong to the site (such as the
site's own logo and design elements): Having a separate page to
describe each resource's licensing is not easily achievable, and may
be completely out of reach for small sites that handle all their
content by hand (most prominently, designers' portfolio sites that
offer many of their contents under some attribution license to
promote their work).

Software: I have stated this previously, but here it goes again: just
like with media, it's impossible to simply put a <link
rel="license" ...> on an MSI package or a tarball. Sure, the package
itself will normally include a file with the text of the corresponding
license(s), but this doesn't help in making the licensing discoverable
by search engines and other forms of web crawlers. It looks like I
should make a page for each of the products (or even each of the
releases), so I can put the <link> tag there and everybody's happy...
actually, this makes so much sense that I already have such
pages for each of my releases (even if there aren't many as of now);
but I *can't* put the <link> on them, because my software is under
more liberal licenses (mostly GPL) than other elements of the page
(such as the site's logo, appearing everywhere on the page, which is
CC-BY-NC-ND), and I obviously don't want such contents to appear on
searches for images that I can modify and use commercially, for
example.

Until now, the best way I have seen to approach this need is
RDF's triple concept: instead of saying "licensed under Y", I'm
trying to say "X is licensed under Y", and maybe also "and X2 is
licensed under Y2", and this is inherently a triple. I am, however,
open to alternatives (at least on this aspect), as long as they
provide any benefit other than mere validation (which I don't even
care about anymore, btw) over currently deployed and available
solutions. I am not sure whether Microdata can handle this case or not
(after all, it is capable of expressing some RDF triples), but the
fact is that I can make my content discoverable by Google and Yahoo
using ccREL (quite suboptimal, and wouldn't validate as HTML5, but
would still work), but I can't do so using Microdata (which is also
suboptimal, would validate as HTML5, but doesn't work anywhere yet).

Regards,
Eduard Pascual


Re: [whatwg] on bibtex-in-html5

2009-06-10 Thread Ian Hickson
On Wed, 20 May 2009, Bruce D'Arcus wrote:

 Re: the recent microdata work and the subsequent effort to include 
 BibTeX in the spec, I summarized my argument against this on my blog:
 
 http://community.muohio.edu/blogs/darcusb/archives/2009/05/20/on-the-inclusion-of-bibtex-in-html5

| 1. BibTeX is designed for the sciences, which typically only cite
|    secondary academic literature. It is thus inadequate for, and not widely
|    used in, many fields outside of the sciences: the humanities and law
|    being quite obvious examples. For this reason, BibTeX cannot by
|    default adequately represent even the use cases Ian has identified.
|    For example, there are many citations on Wikipedia that can only be
|    represented using effectively useless types such as "misc" and which
|    require new properties to be invented.

We will probably have to increase the coverage in due course, yes. 
However, we should verify that the mechanism works in principle before 
investing the time to extend the vocabulary.


| 2. Related, BibTeX cannot represent much of the data in widely used
|    bibliographic applications such as Endnote, RefWorks and Zotero except
|    in very general ways.

If such data is important, we can always add support when this becomes 
clear.


| 3. The BibTeX extensibility model puts a rather large burden on inventing
|    new properties to accommodate data not in the core model. For example,
|    the core model has no way to represent a DOI identifier (this is no
|    surprise, as BibTeX was created before DOIs existed). As a
|    consequence, people have gradually added this to their BibTeX records
|    and styles in a more ad hoc way. This ad hoc approach to extensibility
|    has one of two consequences: either the vocabulary terms are
|    understood as completely uncontrolled strings, or one needs to
|    standardize them. If we assume the first case, we introduce potential
|    interoperability problems. If we assume the second, we have an
|    organizational and process problem: that the WHATWG and/or the
|    W3C (neither of which has expertise in this domain) become the
|    gate-keepers for such extensions. In either case, we have a rather
|    brittle and anachronistic approach to extension.

I don't see any of this as a problem.


| 4. The BibTeX model conflicts with Dublin Core and with vCard, both of
|    which are quite sensibly used elsewhere in the microdata spec to
|    encode information related to the document proper. There seems little
|    justification in having two different ways to represent a document
|    depending on whether it is THIS document or THAT document.

I don't understand this point. Could you provide an example of this 
conflict?


| 5. Aspects of BibTeX's core model are ambiguous/confusing. For example,
|    what does "number" refer to? Is it a document number, or an
|    issue number?

What's the difference? Why does it matter?


| My suggestion instead?
| 1. reuse Dublin Core and vCard for the generic data: titles,
|    creators/contributors, publisher, dates, part/version relations, etc.,
|    and only add those properties (volume, issue, pages, editors, etc.)
|    that they omit

This seems unduly heavy duty (especially the use of vCard for author 
names) when all that is needed is brief bibliographic entries.


| 2. typing should NOT be handled via a bibtex-type property, but the same way
|    everything else is typed in the microdata proposal: a global
|    identifier

Why?


| 3. make it possible for people to interweave other, richer, vocabularies
|    such as bibo within such item descriptions. In other words, extension
|    properties should be URIs.

This is already possible.


| 4. define the mapping to RDF of such an item description; can we say,
|    for example, that it constitutes a dct:references link from the
|    document to the described source?

The mapping to RDF is already defined; further mappings can be done using 
the sameAs mechanism.


On Thu, 21 May 2009, Henri Sivonen wrote:
 
 The set of fields is more of an issue, but it can be fixed by inventing 
 more fields--it doesn't mean the whole base solution needs to be 
 discarded. Fortunately, having custom fields in .bib doesn't break 
 existing pre-Web, pre-ISBN bibliography styles. I've used at least these 
 custom fields:
 
 key: Show this citation pseudo-id in rendering instead of the actual id used
 for matching.
 url: The absolute URL of a resource that is on the Web.
 refdate: The date when the author made the reference to an ephemeral source
 such as a Web page.
 isbn: The ISBN of a publication.
 stdnumber: RFC or ISO number. e.g. RFC 2397 or ISO/IEC 10646:2003(E)
 
 Particularly the 'url' and 'isbn' field names should be obvious and 
 uncontroversial additions.

url seems widely supported and I included it. I haven't added any other 
fields yet; I imagine that once this feature gets traction, we'll have 
more direct data as to which fields would be most useful, and 

Re: [whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

2009-06-10 Thread Eduard Pascual
First of all, Ian, thanks for your reply. I appreciate any opinions on
this subject.

On Wed, Jun 10, 2009 at 1:29 AM, Ian Hickson i...@hixie.ch wrote:
 This proposal is very similar to RDF EASE.
Indeed, they are both CSS-based, and they fulfill similar purposes.
Let me, however, highlight some differences:
1st, EASE is tightly bound to RDFa. However, RDFa is meant for
embedding metadata, and was built with that purpose in mind; while
EASE is meant for linked metadata, so building it on top of RDFa's
embedding constructs is quite unnatural. In contrast, CRDF is built
from CSS's syntax and RDF's (not RDFa's) concepts: it only shares with
RDFa what they both inherit from RDF: the concepts and data model.
2nd, EASE is meant to be complementary to RDFa: they address (or
attempt to address) different use cases / needs (embedding vs.
linking). On the other hand (more on this below), CRDF attempts to
address both cases, plus the case where a hybrid approach is
appropriate (inlining some metadata, and linking the rest).

 While I sympathise with the
 goal of making semantic extraction easier, I feel this approach has
 several fundamental problems which make it inappropriate for the specific
 use cases that were brought up and which resulted in the microdata
 proposal:

  * It separates (by design) the semantics from the data with those
   semantics.
That's not accurate. CRDF *allows* separating the semantics, but
doesn't require you to do so. Everything could be inlined, and the
possibility of separation is just for when it is needed.

   I think this is a level of indirection too far -- when
   something is a heading, it should _be_ a heading, it shouldn't be
   labeled opaquely with a transformation sheet elsewhere defining that it
   maps to the heading semantic.
That doesn't make much sense. When something is a heading, it *is* a
heading. What do you mean by "should be a heading"? CRDF (as well as
many other syntaxes for RDF) allows parsers that don't know the
specific semantics of the markup language to find out that something
is actually a heading anyway, and allows expressing semantics that the
markup language has no direct support for (for example, is it a
site-section heading? a news heading? an iguana's name (used as the
main title for each iguana's page on the iguana collection example)?
something else?).

  * It is even more brittle in the face of copy-and-paste and regular
   maintenance than, say, namespace prefixes. It is very easy to forget to
   copy the semantic transformation rules. It is very easy to edit the
   document such that the selectors no longer match what they used to
   match. It's not at all obvious from looking at the page that there are
   semantics there.
I think the whole copy-paste thing should be broken into two separate scenarios:
Copy-pasting source code: with the next version of the document (which
I'm already cleaning up, and which will allow @namespace rules inside the
inlining attribute), this will be as brittle (and as resilient) as
prefixes are: when a fragment that includes the @namespaces or
prefixes it needs is copy-pasted, it will work as expected; OTOH, if a
rule relies on a namespace that is not available (declared outside of
the copy-pasted fragment), the rule will just be ignored. The risk of
the copied code clashing with declarations in its new location is
lower than it may seem: an author who is already adding CRDF code to
his pages is quite likely to review the code he's copying for the
semantics that may be there; and authoring tools that automatically
add semantic code should review whether things make sense or not when
pasting code into them (for example, invalid/redundant properties
could/should be notified to the author).
Copy-pasting content: currently, browser support for copy-pasting CSS
styled content is mediocre and inconsistent (some browsers do it
right, some don't, some don't even try), but this is already more than
what is supported for RDFa, Microdata, or other semantic formats. With
a bit of luck, pressure for browsers to include CRDF properties when
copying content could help to get decent support for CSS properties as
well (since most of the code for these tasks would be shared).

  * It relies on selectors to do something subtle. Authors have a great
   deal of trouble understanding selectors -- if you watch a typical Web
   author writing CSS, he will either use just class selectors, or he
   will write selectors by trial and error until he gets the style he
   wants. This isn't fatal for CSS because you can see the results right
   there; for something as subtle as semantic data mining, it is extremely
   likely that authors will make mistakes that turn their data into
   garbage, which would make the feature impractical for large-scale use.
It relies on selectors to do what they do: select things. Nobody is
*asking* authors to make use of over-complicated selectors for each
piece of metadata they want to add; but CRDF tries to *allow* using
any valid 

Re: [whatwg] on bibtex-in-html5

2009-06-10 Thread Julian Reschke

Ian Hickson wrote:

...
So far based on my experience with the Workers, Storage, Web Sockets, and 
Server-sent Events sections, I'm not convinced that the advantage of 
getting more review is real. Those sections in particular got more review 
while in the HTML5 spec proper than they have since.

...


So you are putting stuff you're personally interested in into the HTML5 
spec, so that people read it?


What a cunning plan.

BR, Julian




Re: [whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop

2009-06-10 Thread Thomas Broyer
On Wed, Jun 10, 2009 at 10:37 AM, Adam Barth wrote:
 SUMMARY

 Currently the input element exposes selected files via a files
 accessor (i.e., as in https://developer.mozilla.org/En/NsIDOMFileList
 and http://www.w3.org/TR/file-upload/).  We should add a similar
 accessor to event.dataTransfer to enable drag-and-drop of files onto
 web pages.
[...]
 Alternative APIs include getData('File'), as defined in
 http://msdn.microsoft.com/en-us/library/ms537658(VS.85).aspx.
 However, it does not appear that IE ever implemented this API.  (Also,
 note that IE doesn't follow HTML 5's clipboard security model.)
 Mozilla has an equivalent API in
 event.dataTransfer.mozGetDataAt("application/x-moz-file", 0).

It should be noted also that Adobe AIR has
getData("application/x-vnd.adobe.air.file-list") [1] and Gears
(starting with 0.5.21.0, as announced at Google I/O) has its own (not
yet documented) API with a files property [2] (as requested here).


[1] 
http://help.adobe.com/en_US/AIR/1.5/devappshtml/WS7709855E-7162-45d1-8224-3D4DADC1B2D7.html
[2] 
http://code.google.com/p/gears/source/browse/trunk/gears/test/manual/drag_and_drop.html#109

-- 
Thomas Broyer


Re: [whatwg] on bibtex-in-html5

2009-06-10 Thread Bruce D'Arcus
Am cc-ing the Zotero dev list just for posterity ...

On Wed, Jun 10, 2009 at 5:44 AM, Ian Hickson i...@hixie.ch wrote:
 On Wed, 20 May 2009, Bruce D'Arcus wrote:

 Re: the recent microdata work and the subsequent effort to include
 BibTeX in the spec, I summarized my argument against this on my blog:

 http://community.muohio.edu/blogs/darcusb/archives/2009/05/20/on-the-inclusion-of-bibtex-in-html5

 | 1. BibTeX is designed for the sciences, which typically only cite
 |    secondary academic literature. It is thus inadequate for, and not widely
 |    used in, many fields outside of the sciences: the humanities and law
 |    being quite obvious examples. For this reason, BibTeX cannot by
 |    default adequately represent even the use cases Ian has identified.
 |    For example, there are many citations on Wikipedia that can only be
 |    represented using effectively useless types such as "misc" and which
 |    require new properties to be invented.

 We will probably have to increase the coverage in due course, yes.
 However, we should verify that the mechanism works in principle before
 investing the time to extend the vocabulary.

No; you should drop this proposal and move it to an experimental annex.

If you do insist, against all reason, in pushing forward with this
without modification, then I suggest you explain how this process of
extension will work. If, as I suspect, it'll be another case of a
centralized authority (you, who have admitted you really know nothing
about this space), then that's a deal-breaker from my perspective.

 | 2. Related, BibTeX cannot represent much of the data in widely used
 |    bibliographic applications such as Endnote, RefWorks and Zotero except
 |    in very general ways.

 If such data is important, we can always add support when this becomes
 clear.

Man this is frustrating.

 | 3. The BibTeX extensibility model puts a rather large burden on inventing
 |    new properties to accommodate data not in the core model. For example,
 |    the core model has no way to represent a DOI identifier (this is no
 |    surprise, as BibTeX was created before DOIs existed). As a
 |    consequence, people have gradually added this to their BibTeX records
 |    and styles in a more ad hoc way. This ad hoc approach to extensibility
 |    has one of two consequences: either the vocabulary terms are
 |    understood as completely uncontrolled strings, or one needs to
 |    standardize them. If we assume the first case, we introduce potential
 |    interoperability problems. If we assume the second, we have an
 |    organizational and process problem: that the WHATWG and/or the
 |    W3C (neither of which has expertise in this domain) become the
 |    gate-keepers for such extensions. In either case, we have a rather
 |    brittle and anachronistic approach to extension.

 I don't see any of this as a problem.

The problem, to repeat myself again, is related to the above "we'll
extend it as we see fit" issue.

The two biggest problems in bibtex are two properties:

book
journal

They're a problem because they're both horribly concrete/narrow, and
(arguably) redundant.

If those were instead replaced with something more generic like either:

1) publication-title

... or, better yet ...

2) a nested/related object (call it publication or container or isPartOf)

... then extension becomes easier. If I need to encode a newspaper
article, then I just do:

title = "Some Article"
publication-title = "Some Newspaper"

.. or (better, because I can attach other information to the container):

title = "Some Article"
publication = [ title = "Some Newspaper" ]

As is, you need to add stuff like this just to resolve the problems
I've repeatedly pointed out:

newspaper-title
magazine-title
court-reporter-title
television-program-title
radio-program-title

Aside: of course, some of the above could be collapsed into more
generic stuff like broadcast-title, but I'm just following the same,
broken, approach as bibtex.

This stuff isn't theoretical, Ian. Just look through this Wikipedia
page, for example:

http://en.wikipedia.org/wiki/Guantanamo_Bay_detention_camp

The citations include references to legal cases and briefs, and news
articles (television, radio and print). Your proposal doesn't cover
this stuff.

OTOH, applications like Zotero can.

 | 4. The BibTeX model conflicts with Dublin Core and with vCard, both of
 |    which are quite sensibly used elsewhere in the microdata spec to
 |    encode information related to the document proper. There seems little
 |    justification in having two different ways to represent a document
 |    depending on whether it is THIS document or THAT document.

 I don't understand this point. Could you provide an example of this
 conflict?

Here's an academic article in an open access biology journal.

http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.182

THIS article refers to the metadata about the document proper, with
the title Accelerated Adaptive Evolution 

Re: [whatwg] Reserving id attribute values?

2009-06-10 Thread Giovanni Campagna
2009/6/10 Ian Hickson i...@hixie.ch:
 On Tue, 19 May 2009, Brett Zamir wrote:

 In order to comply with the ID requirements of XML, and to facilitate
 future transitions to XML, can HTML 5 explicitly encourage id attribute
 values to follow this pattern (e.g., disallowing numbers as the
 starting character)?

 Why can't we just change the XML ID requirements in XML to be less strict?

Because you are not part of the XML Core WG, because XML is a
Recommendation and because ID has been a Name from the very beginning
of SGML. If something should be changed, it is the HTML5 draft.
Naturally it should be only an author conformance requirement.


 Also, there is this minor errata:
 http://www.whatwg.org/specs/web-apps/current-work/#refsCSS21 is broken
 (in section 3.2)

 I haven't done any references yet; I'll probably get to them in a couple
 of months.

 --
 Ian Hickson               U+1047E                )\._.,--,'``.    fL
 http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
 Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Giovanni


Re: [whatwg] on bibtex-in-html5

2009-06-10 Thread simon
  | 1. BibTeX is designed for the sciences, which typically only cite
  |    secondary academic literature. It is thus inadequate for, and not widely
  |    used in, many fields outside of the sciences: the humanities and law
  |    being quite obvious examples. For this reason, BibTeX cannot by
  |    default adequately represent even the use cases Ian has identified.
  |    For example, there are many citations on Wikipedia that can only be
  |    represented using effectively useless types such as "misc" and which
  |    require new properties to be invented.
 
  We will probably have to increase the coverage in due course, yes.
  However, we should verify that the mechanism works in principle before
  investing the time to extend the vocabulary.
 
 No; you should drop this proposal and move it to an experimental annex.
 
 If you do insist, against all reason, in pushing forward with this
 without modification, then I suggest you explain how this process of
 extension will work. If, as I suspect, it'll be another case of a
 centralized authority (you; who have admitted you really know nothing
 about this space), then that's a deal-breaker from my perspective.

Related to this I want to make a few remarks on a more general level: We are 
currently experiencing major changes in the world of bibliographic software. At 
least, this is how I experience it. After years of limited and/or closed 
formats and models like BibTeX or Endnote we finally see new models like CSL or 
biblatex emerging which try to learn the lessons of the past. Of course, 
I do not know how things will evolve, but looking at the success of solutions 
like Zotero I think it's not so bold to say that things will change quite a bit 
in the coming years.

And then we have HTML5, an emerging standard which is now getting support from 
the newest browsers. I know even less about how HTML5 will evolve, or what 
impact it will have on the web. But it's probably fair to say that widespread 
adoption of HTML5 will not happen overnight.

Honestly, I really don't get why a coming web standard should support a 
bibliographic standard which is obviously outdated. The fact that BibTeX is 
widely used is really a non-argument, because if we follow this logic we won't 
have any development. By the same logic you should avoid something like video 
– after all, there isn't any support for it *yet*. If HTML5 wants to be 
forward-looking, it certainly shouldn't adopt a twenty-year-old standard but 
should instead try to support something new which is really up to date and has 
a chance of being useful in the future.

simon





Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Tab Atkins Jr.
On Wed, Jun 10, 2009 at 3:46 AM, Eduard Pascual herenva...@gmail.com wrote:
 On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote:
 [...]
 This has some implications:

  - Each unit of content (recipe in this case) must have its own
   independent page at a distinct URL. This is actually good practice
   anyway today for making content discoverable from search engines, and
   it is compatible with what people already do, so this seems fine.

 This is, in a wide range of cases, entirely impossible: while it might
 work, and maybe it's even good practice, for content that can be
 represented on the web as an HTML document, it is not achievable for
 many other formats. Here are some obvious cases:

 Pictures (and other media) used on a page: An author might want to
 have protected content, but to allow re-use of some media under
 certain licenses. A good example of this is online media libraries,
 which have a good deal of media available for reuse but obviously
 protect the resources that inherently belong to the site (such as the
 site's own logo and design elements): Having a separate page to
 describe each resource's licensing is not easily achievable, and may
 be completely out of reach for small sites that handle all their
 content by hand (most prominently, designers' portfolio sites that
 offer many of their contents under some attribution license to
 promote their work).

Even on small sites, though, if they have a picture gallery they
almost certainly have the ability to view each picture individually as
well, usually by clicking on the picture itself.  That's the page
you'd put the license information on.

I think it's fundamentally rare to have a bunch of resources that (a)
*only* exist grouped together on a single page, and (b) need different
licenses.

 Software: I have stated this previously, but here it goes again: just
 like with media, it's impossible to simply put a <link
 rel="license" ...> on an MSI package or a tarball. Sure, the package
 itself will normally include a file with the text of the corresponding
 license(s), but this doesn't help in making the licensing discoverable
 by search engines and other forms of web crawlers. It looks like I
 should make a page for each of the products (or even each of the
 releases), so I can put the <link> tag there and everybody's happy...
 actually, this makes so much sense that I already have such
 pages for each of my releases (even if there aren't many as of now);
 but I *can't* put the <link> on them, because my software is under
 more liberal licenses (mostly GPL) than other elements of the page
 (such as the site's logo, appearing everywhere on the page, which is
 CC-BY-NC-ND), and I obviously don't want such contents to appear on
 searches for images that I can modify and use commercially, for
 example.

As Ian stated, <link rel="license"> does *not* mean "This entire page
is covered under the linked license", but rather "The primary content
of this page is covered under the linked license."  This is different
from preliminary definitions of rel=license, but it's how it is
overwhelmingly used in practice, and so HTML5 redefined it to match.

So, since you already create separate pages for each release, you're
completely fine.  ^_^

 Until now, the best way I have seen to approach this need is
 RDF's triple concept: instead of saying "licensed under Y", I'm
 trying to say "X is licensed under Y", and maybe also "and X2 is
 licensed under Y2", and this is inherently a triple. I am, however,
 open to alternatives (at least on this aspect), as long as they
 provide any benefit other than mere validation (which I don't even
 care about anymore, btw) over currently deployed and available
 solutions. I am not sure whether Microdata can handle this case or not
 (after all, it is capable of expressing some RDF triples), but the
 fact is that I can make my content discoverable by Google and Yahoo
 using ccREL (quite suboptimal, and wouldn't validate as HTML5, but
 would still work), but I can't do so using Microdata (which is also
 suboptimal, would validate as HTML5, but doesn't work anywhere yet).

Of course microdata can handle it.  Assuming a theoretical Microdata
vocab for Creative Commons, you can do it with:

<div item>
  <div itemprop="cc.work">
    foo...
  </div>
  <a itemprop="cc.license"
     href="http://creativecommons.org/license/cc-gpl">This work is licensed
     under the GNU GPL, version 3 or later</a>
</div>

(You can also separate the license markup from your work by slapping
an id on your work and using @subject on the license link.)

Remember, Microdata and RDF are essentially identical in nearly all
realistic cases, with only a few small differences - namely that
Microdata forms a tree structure rather than a more general graph.
That's rarely relevant, however, and nearly all common metadata
annotations can be done just fine as a tree.

Though, of course, as long as your work was the primary content of the
page, you can skip Microdata entirely and 

Re: [whatwg] [html5] r3218 - [] (0) Mention frameset event handler attributes (they work like body's apparently)

2009-06-10 Thread Simon Pieters

On Wed, 10 Jun 2009 10:31:54 +0200, wha...@whatwg.org wrote:


Author: ianh
Date: 2009-06-10 01:31:52 -0700 (Wed, 10 Jun 2009)
New Revision: 3218

Modified:
   index
   source
Log:
[] (0) Mention frameset event handler attributes (they work like  
body's apparently)




+  <p>In addition, <code>frameset</code> elements must implement the
+  following interface:</p>
+
+  <pre class="idl">interface <dfn>HTMLFramesetElement</dfn> : <span>HTMLElement</span> {


Should be HTMLFrameSetElement.

rows and cols should probably be in the interface, too.

While you're at it, you could specify HTMLFrameElement. Maybe there are  
other interfaces or members that are currently lacking.


--
Simon Pieters
Opera Software


Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Bruce D'Arcus
On Wed, Jun 10, 2009 at 9:19 AM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 3:46 AM, Eduard Pascual herenva...@gmail.com wrote:
 On Fri, May 8, 2009 at 9:57 PM, Ian Hickson i...@hixie.ch wrote:
 [...]
 This has some implications:

  - Each unit of content (recipe in this case) must have its own
   independent page at a distinct URL. This is actually good practice
   anyway today for making content discoverable from search engines, and
   it is compatible with what people already do, so this seems fine.

 This is, in a wide range of cases, entirely impossible: while it might
 work, and maybe it's even good practice, for content that can be
 represented on the web as an HTML document, it is not achievable for
 many other formats. Here are some obvious cases:

 Pictures (and other media) used on a page: An author might want to
 have protected content, but to allow re-use of some media under
 certain licenses. A good example of this is online media libraries,
 which have a good deal of media available for reuse but obviously
 protect the resources that inherently belong to the site (such as the
 site's own logo and design elements): Having a separate page to
 describe each resource's licensing is not easily achievable, and may
 be completely out of reach for small sites that handle all their
 content by hand (most prominently, designers' portfolio sites that
 offer many of their contents under some attribution license to
 promote their work).

 Even on small sites, though, if they have a picture gallery they
 almost certainly have the ability to view each picture individually as
 well, usually by clicking on the picture itself.  That's the page
 you'd put the license information on.

What about the case where you have a JS-based viewer, and so when the
user clicks a photo, they do not go to a separate page, but instead
get a pop-up viewer?

Surely that's common, and it's entirely feasible that different photos
on the page would have different licenses.

Or another case: a weblog that includes third-party photo content
(could be your own photos too). You want to label your blog text with
one license, and the linked photos with another.

...

Bruce


Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Brett Zamir

- Original Message 

From: Ian Hickson i...@hixie.ch
To: Brett Zamir bret...@yahoo.com
Cc: wha...@whatwg.org
Sent: Wednesday, June 10, 2009 11:48:09 AM
Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?
 
On Mon, 18 May 2009, Brett Zamir wrote:
 
 Has any thought been given to standardizing on at least a part of DOM 
 Level 3 Load and Save in HTML5?
 
 DOM3 Load and Save is already standardised as far as I can tell. I don't 
 see why HTML5 would have to say anything about it.

The hope was that there would be some added impetus for browsers to settle on 
a standard way of doing this, since to my knowledge only 
Opera has implemented DOM Level 3 LS (Mozilla, for one, hasn't seemed keen on 
implementing it), and I'm afraid this otherwise very important functionality 
will remain unimplemented or unstandardized across browsers. DOMParser() and 
XMLSerializer() may be available in more than just Mozilla, but are not 
standardized; and innerHTML, along with the other issues Boris mentioned in the 
DOMParser / XMLSerializer thread (e.g., being able to parse by content type 
like XML), just doesn't sound appropriate for handling a plain XML document 
when it's called innerHTML.

thanks,
Brett
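
For comparison, the wrapper route Brett mentions, using the unstandardized
but widely available Mozilla-originated interfaces (the parseURI helper
below is just a stand-in for LSParser.parseURI, not an existing API):

// Parse a string into an XML Document and serialize it back out.
var doc = new DOMParser().parseFromString('<greeting>hi</greeting>',
                                          'application/xml');
var xml = new XMLSerializer().serializeToString(doc);

// A tiny stand-in for LSParser.parseURI, built on XMLHttpRequest.
function parseURI(uri, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', uri, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState == 4)
      callback(xhr.responseXML);  // parsed per the response content type
  };
  xhr.send();
}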



Re: [whatwg] Feedback

2009-06-10 Thread Tab Atkins Jr.
On Tue, Jun 9, 2009 at 10:59 PM, Mike
Weissenborn mike_weissenb...@yahoo.com wrote:
 1) I've used frames in many web pages, and I see this is being dropped.  I
 typically have a selection frame and a result frame, so links clicked on in
 the 1st frame show up in the second frame.  I then never have to worry
 about managing what's in each frame.  For many pages I can likely use a
 block element like a DIV, but my ISP has size limitations and I have spread
 my pages onto several sites.  I have no problems switching to something
 else, but I didn't see anything in the specs except opening a new window to
 accomplish this.  If something else is being used, how will it be
 compatible with older browsers?

In general, frames are bad.  They break bookmarking, are horrible for
accessibility, and don't do anything that can't be accomplished
relatively easily in better ways.

For your particular use-case, it appears you have a really weird ISP.
I'd suggest leaving them and getting a competent one.  ^_^  I can
suggest some privately if you'd like.  Your current setup will be next
to impossible to do properly.

It seems like what you're doing currently is putting common 'site'
navigation in one frame, and page contents in another.  Generally the
way this is done without frames is to use some server-side language
(PHP, etc.) to 'build' your pages, combining one or two 'templates'
files with a 'content' file so that it makes a full page.  That way
you can modify the template files once and have the change reflected
across the entire site automatically.  This is usually rather trivial
to implement, and in the end you have a page with none of the problems
that frames have.

 2) I am perhaps one of the few I know to use XForms and I am excited about
 being able to have like capabilities in all browsers.  The implementation
 image I saw looked somewhat different and didn't really describe what is new,
 changed or obsolete.  Personally I want the same capabilities as XForms;
 being able to save locally, FTP, or URL, and this wasn't really identified.
 I don't mind having to make changes, I just want it to work.

 Still on XForms, I would like additional functionality, which I think you may
 have dealt with: being able to reformat/reorder the data via CSS or a datagrid
 to a format the user wishes the data to be viewed in.  Obviously this may
 be defined via code, but I'm hoping the WebForms implementation will allow
 for things such as sortable columns, re-ordered columns, hide/show
 columns...

 I don't know if the subject of data binding has ever come up.  I like the
 data binding in IE, however other browsers don't support this ability and I
 have to use binding in IE and XForms for Firefox.  I would really benefit
 from being able to use the same code for both.  I did notice a Local Storage
 component, which I hope some consistent client call can be done to Post
 or Sync these to a URL...

Don't know much about XForms, and haven't had cause to look into them,
so can't help you here.

 3) XForms or not, I hope anything displayable can be formatted appropriately
 using CSS.  There seem to be many browser-specific formatting settings; is
 there a way to consolidate these with this release to eliminate or reduce
 browser-specific CSS settings?

There are very good reasons why no browser allows you full control
over form element styling; namely, security.  A few elements (mainly
<input type="file">) must be handled very carefully to make sure they
can't be abused, and allowing arbitrary styling pushes the door wide
open.

Regardless, though, this is a CSS issue, not an HTML issue, and so
should be on the CSS mailing list.

 4) If #3 can't be implemented, it would then be nice to have some type of
 IF statement within CSS so additional CSS can be included or excluded
 for non-compliant browsers...  Even down the road, the ability
 to include/exclude imports based on browser capabilities could benefit
 many.  Unless defined, browser builders will continue to build their own
 settings.  I'm sure this is out of your control, but perhaps an IF isn't.  I
 hate the idea of having to create a different presentation based on the
 browser, but how does one ever ensure someone's browser is compatible or the
 content is displayed appropriately?

Again, CSS issue, not HTML.

 5) On the CSS, I'm sure builders/browser developers would love an XML
 format.  If there are no CSS format changes, perhaps this can be identified
 as a future enhancement/direction.  CSS seems to be a real oddball format
 compared to everything else.

Been discussed (though possibly as an April Fools?).  Doesn't seem to
be any real reason to do this, other than that some people already
have XML generators lying around and would like to use them.

And once more, CSS issue, not HTML.  ^_^

 6) I did see some comment about user-defined variables in the FAQ.  I see
 no reason why, if I embed something called MIKE in an HTML file and the CSS
 attributes 

Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Julian Reschke

Jeff Walden wrote:

...
Maybe I'm the only person who thinks it (I'd like to hope I'm merely the 
only person to say it, unless I've missed its mention in the past), but 
this feels like mission creep to me.

...


You're not the only person.

BR, Julian


Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Julian Reschke

Leif Halvard Silli wrote:

...

and  there are a number of folks who disagree (not just us in RDFa),
including at least two RECs (RDFa and GRDDL).


Is this claim based on a mere comparison of the description of those 
link relations in said specifications? Perhaps some of the disagreements 
are merely a matter of different wording?

...


As a matter of fact I don't see RDFa using @profile.


The point is: if you assume that @rel=foo always means the same thing,
then many folks believe you're already violating the HTML spec, which
specifically uses @profile to modulate the meaning of @rel, and
sometimes via another level of indirection.


Where does the Nottingham draft define anything that contradicts the default 
HTML 4.01 profile?  Authors will often assume that rel=foo does mean 
the same thing wherever it appears, hence a central register is a 
benefit, so that specification writers and profile writers can know what 
the standard semantics are.


The Web Linking draft does not override anything in HTML 4.01. It just 
states that generic link relations are a good idea, creates an IANA 
registry for them, and defines how to use them in the HTTP Link header.
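
For instance, the Link header form defined there attaches a relation to
any resource, HTML or not (the license URL below is illustrative):

Link: <http://creativecommons.org/licenses/by/3.0/>; rel="license"

Served alongside a tarball or MSI package, that would address the
non-HTML-resource case Eduard raises elsewhere in this digest.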


That being said, I *do* believe that it's an incredibly bad idea to use 
the same relation name for different things.



...


BR, Julian


Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Bruce D'Arcus
On Wed, Jun 10, 2009 at 10:05 AM, Tab Atkins Jr. jackalm...@gmail.com wrote:

...

 What about the case where you have a JS-based viewer, and so when the
 user clicks a photo, they do not go to a separate page, but instead
 get a pop-up viewer?

 That is indeed a valid concern.  The obvious way to address it is to
 have a permalink on the JS popup, which will send you to an individual
 page with that content where the license info is located.  In this
 scenario the JS viewer is merely a convenience, allowing you to view
 the pictures without a page refresh, rather than a necessity.
 Hopefully that's true anyway, for accessibility reasons!

 Thus you get the best of both worlds - machine-readable data on the
 individual pages, and you can still put human-readable license info on
 the JS popup.

But why can't one have the best of both worlds without having to go
to separate pages for each photo?

 Surely that's common, and it's entirely feasible that different photos
 on the page would have different licenses.

 I don't think it's that common for different photos on the page to
 have different licenses (and preventing that scenario is just one more
 reason to fight license proliferation), but even if true it's covered
 by the above.

Depends what you mean by "covered". I'd say the RDFa examples of this
cover it better in the sense that they don't impose an arbitrary
restriction that the license only applies to a single object (or I
suppose group of objects).

 Or another case: a weblog that includes third-party photo content
 (could be your own photos too). You want to label your blog text with
 one license, and the linked photos with another.

 This is indeed not covered by @rel=license.  Is it necessary to embed
 the separate licensing information for the pictures in a
 machine-readable way?  It seems that just putting a human-readable
 license link on each picture would work pretty well.

This isn't really my area, but I could imagine an organization (in
particular) wanting to include machine-readable license links (a la
CC).

Bruce


Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Tab Atkins Jr.
On Wed, Jun 10, 2009 at 9:37 AM, Bruce D'Arcus bdar...@gmail.com wrote:
 On Wed, Jun 10, 2009 at 10:05 AM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 What about the case where you have a JS-based viewer, and so when the
 user clicks a photo, they do not go to a separate page, but instead
 get a pop-up viewer?

 That is indeed a valid concern.  The obvious way to address it is to
 have a permalink on the JS popup, which will send you to an individual
 page with that content where the license info is located.  In this
 scenario the JS viewer is merely a convenience, allowing you to view
 the pictures without a page refresh, rather than a necessity.
 Hopefully that's true anyway, for accessibility reasons!

 Thus you get the best of both worlds - machine-readable data on the
 individual pages, and you can still put human-readable license info on
 the JS popup.

 But why can't one have the best of both worlds without having to go
 to separate pages for each photo?

Hopefully you have a separate page for each photo *anyway*.  If you
don't - that is, if you only have a thumbnails page, and then a
js-based fullsize viewer - your page is pretty crappy in terms of
accessibility and discoverability.

Given that of course we all value making our pages accessible ^_^, the
problem is already solved.  The js-based viewer is merely a
convenience for those that can use it, and license information can be
embedded on the individual pages.

 Surely that's common, and it's entirely feasible that different photos
 on the page would have different licenses.

 I don't think it's that common for different photos on the page to
 have different licenses (and preventing that scenario is just one more
 reason to fight license proliferation), but even if true it's covered
 by the above.

 Depends what you mean by covered. I'd say the RDFa examples of this
 cover it better in the sense that they don't impose an arbitrary
 restriction that the license only applies to a single object (or I
 suppose group of objects).

The restriction is far from arbitrary - it makes it dead-simple.  Any
solution that allows you to assign different licenses to various
pieces of content on a single page in a machine-readable way is
necessarily more complex.  It's not apparent in these examples that
anything more complex is necessary.

Regardless, though, the situation is *indeed* covered.  The fact that
you can imagine a slightly different solution doesn't change the fact
that existing markup is *a* solution, at least for any halfway decent
site design.

 Or another case: a weblog that includes third-party photo content
 (could be your own photos too). You want to label your blog text with
 one license, and the linked photos with another.

 This is indeed not covered by @rel=license.  Is it necessary to embed
 the separate licensing information for the pictures in a
 machine-readable way?  It seems that just putting a human-readable
 license link on each picture would work pretty well.

 This isn't really my area, but I could imagine an organization (in
 particular) wanting to include machine-readable license links (a la
 CC).

Can you illustrate this more plainly?

~TJ


Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Michael A. Puls II

On Wed, 10 Jun 2009 09:49:08 -0400, Brett Zamir bret...@yahoo.com wrote:


- Original Message 


From: Ian Hickson i...@hixie.ch
To: Brett Zamir bret...@yahoo.com
Cc: wha...@whatwg.org
Sent: Wednesday, June 10, 2009 11:48:09 AM
Subject: Re: [whatwg] DOM3 Load and Save for simple  
parsing/serialization?


On Mon, 18 May 2009, Brett Zamir wrote:


Has any thought been given to standardizing on at least a part of DOM
Level 3 Load and Save in HTML5?


DOM3 Load and Save is already standardised as far as I can tell. I don't
see why HTML5 would have to say anything about it.
The hope was that there would be some added impetus for browsers to  
settle on a standard way of doing this, since to my knowledge only  
Opera has implemented DOM Level 3 LS


Opera's implementation is buggy. The async version never fires a load  
event, handling of errors is all messed up and some functions don't work.  
It's pretty much useless except for synchronous loading in perfect  
conditions.


It seems that everyone wants DOM3 LS to die and to have everyone use JS  
to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to  
do what DOM3 LS does.


--
Michael


Re: [whatwg] Expose event.dataTransfer.files accessor to allow file drag and drop

2009-06-10 Thread Kristof Zelechovski
Microsoft has recently invented and deployed a custom ActiveX component to
drop local files onto Live Spaces.  This component is undocumented and it is
probably limited to the Spaces service.
Chris



Re: [whatwg] Helping people searching for content filtered by license

2009-06-10 Thread Kristof Zelechovski
A JavaScript-based viewer for images can overlay an image within an IFRAME
and the IFRAME may contain the license link.
HTH,
Chris



Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Anne van Kesteren
On Wed, 10 Jun 2009 17:13:28 +0200, Michael A. Puls II shadow2...@gmail.com 
wrote:
 Opera's implementation is buggy. The async version never fires a load  
 event, handling of errors is all messed up and some functions don't  
 work. It's pretty much useless except for synchronous loading in perfect  
 conditions.

We should probably nuke it.


 It seems that everyone wants DOM3 LS to die and to have everyone use JS  
 to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to  
 do what DOM3 LS does.

Yeah, no need for two high-level network APIs.


-- 
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Limit on number of parallel Workers.

2009-06-10 Thread Drew Wilson
That's a great approach. Is the pool of OS threads per-domain, or per
browser instance (i.e. can a domain DoS the workers of other domains by
firing off several infinite-loop workers)? Seems like having a per-domain
thread pool is an ideal solution to this problem.

-atw

On Tue, Jun 9, 2009 at 9:33 PM, Dmitry Titov dim...@chromium.org wrote:

 On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordman micha...@google.com wrote:


 This is the solution that Firefox 3.5 uses. We use a pool of
 relatively few OS threads (5 or so iirc). This pool is then scheduled
 to run worker tasks as they are scheduled. So for example if you
 create 1000 worker objects, those 5 threads will take turns to execute
 the initial scripts one at a time. If you then send a message using
 postMessage to 500 of those workers, and the other 500 calls
 setTimeout in their initial script, the same threads will take turns
 to run those 1000 tasks (500 message events, and 500 timer callbacks).

 This is somewhat simplified, and things are a little more complicated
 due to how we handle synchronous network loads (during which we freeze
 an OS thread and remove it from the pool), but the above is the basic
 idea.

 / Jonas


 That's a really good model. Scalable, and it degrades nicely. The only
 problem is with very long-running operations, where a worker script doesn't
 return in a timely fashion. If enough of them do that, all the others
 starve. What does FF do about that, or in practice do you anticipate that
 not being an issue?

 Webkit dedicates an OS thread per worker. Chrome goes even further (for
 now at least) with a process per worker. The 1:1 mapping is probably
 overkill as most workers will probably spend most of their life asleep just
 waiting for a message.


 Indeed, it seems FF has a pretty good solution for this (at least for the
 non-multiprocess case). 1:1 does not scale well with threads, and
 especially not with processes.

 Here http://figushki.com/test/workers/workers.html is a page that can
 create a variable number of workers to observe the effects; the curious can
 run it in FF3.5, in Safari 4, or in Chromium with the '--enable-web-workers'
 flag. Don't click the 'add 1000' button in Safari 4 or Chromium unless you
 are prepared to kill the unresponsive browser while the whole system gets
 half-frozen. FF continues to work just fine, well done guys :-)
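
 A sketch of what a stress test of that kind might look like (I have not
 inspected the linked page; 'busy.js' is a hypothetical worker script):

   var workers = [];
   function addWorkers(n) {
     for (var i = 0; i < n; i++) {
       var w = new Worker('busy.js');
       w.onmessage = function (e) { /* count progress pings */ };
       workers.push(w);
     }
   }

   // busy.js: spin forever, pinging the page now and then
   var i = 0;
   while (true) { if (++i % 1e8 === 0) postMessage(i); }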

 Dmitry



Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Brett Zamir

From: Anne van Kesteren ann...@opera.com
To: Michael A. Puls II shadow2...@gmail.com; Brett Zamir 
bret...@yahoo.com; Ian Hickson i...@hixie.ch
Cc: wha...@whatwg.org
Sent: Thursday, June 11, 2009 12:31:10 AM
Subject: Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

On Wed, 10 Jun 2009 17:13:28 +0200, Michael A. Puls II shadow2...@gmail.com 
wrote:
 It seems that everyone wants DOM3 LS to die and to have everyone use JS 
 to make their own wrapper around XHR + DOMParser + XMLSerializer etc. to 
 do what DOM3 LS does.

Yeah, no need for two high-level network APIs.

That'd be fine by me if at least DOMParser + XMLSerializer was being officially 
standardized on...

Brett


Re: [whatwg] DOM3 Load and Save for simple parsing/serialization?

2009-06-10 Thread Anne van Kesteren

On Wed, 10 Jun 2009 20:36:30 +0200, Brett Zamir bret...@yahoo.com wrote:
That'd be fine by me if at least DOMParser + XMLSerializer was being  
officially standardized on...


See the separate thread on those objects.


--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Limit on number of parallel Workers.

2009-06-10 Thread Jonas Sicking
On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordman micha...@google.com wrote:

 This is the solution that Firefox 3.5 uses. We use a pool of
 relatively few OS threads (5 or so iirc). This pool is then scheduled
 to run worker tasks as they are scheduled. So for example if you
 create 1000 worker objects, those 5 threads will take turns to execute
 the initial scripts one at a time. If you then send a message using
 postMessage to 500 of those workers, and the other 500 calls
 setTimeout in their initial script, the same threads will take turns
 to run those 1000 tasks (500 message events, and 500 timer callbacks).

 This is somewhat simplified, and things are a little more complicated
 due to how we handle synchronous network loads (during which we freeze
 an OS thread and remove it from the pool), but the above is the basic
 idea.

 / Jonas

 That's a really good model. Scalable, and it degrades nicely. The only
 problem is with very long-running operations, where a worker script doesn't
 return in a timely fashion. If enough of them do that, all the others
 starve. What does FF do about that, or in practice do you anticipate that
 not being an issue?
 Webkit dedicates an OS thread per worker. Chrome goes even further (for now
 at least) with a process per worker. The 1:1 mapping is probably overkill as
 most workers will probably spend most of their life asleep just waiting for
 a message.

We do see it as a problem, but not big enough of a problem that we
needed to solve it in the initial version.

It's not really a problem for most types of calculations: as long as
the number of threads is larger than the number of cores, we'll still
finish all tasks as quickly as the CPU is able to. Even for long-running
operations, if it's work that the user wants anyway, it doesn't really
matter whether the jobs all run in parallel or staggered after each
other, as long as you're keeping all CPU cores busy.

There are some scenarios which it doesn't work so well for. For
example a worker that works more or less infinitely and produces more
and more accurate results the longer it runs. Or something like a
Folding@home website which performs calculations as long as the user
is on a website and submits them to the server.

If enough of those workers are scheduled it will block everything else.

This is all solvable of course; there's a lot of tweaking we can do.
But we figured we wanted to get some data on how people use workers
before spending too much time developing a perfect scheduling
solution.
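
(As one mitigation available to authors today, a long-running worker can
chunk its own work so that each posted task runs to completion quickly; a
sketch, where refine() stands in for a hypothetical unit of work:)

   // Inside a worker: compute in slices so a pool thread is released
   // between chunks instead of being held in one endless loop.
   var state = 0;
   function step() {
     for (var i = 0; i < 100000; i++) state = refine(state);  // one slice
     postMessage(state);   // report the increasingly accurate result so far
     setTimeout(step, 0);  // yield, then schedule the next slice
   }
   step();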

/ Jonas


Re: [whatwg] Limit on number of parallel Workers.

2009-06-10 Thread Michael Nordman
On Wed, Jun 10, 2009 at 1:46 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordman micha...@google.com
 wrote:
 
  This is the solution that Firefox 3.5 uses. We use a pool of
  relatively few OS threads (5 or so iirc). This pool is then scheduled
  to run worker tasks as they are scheduled. So for example if you
  create 1000 worker objects, those 5 threads will take turns to execute
  the initial scripts one at a time. If you then send a message using
  postMessage to 500 of those workers, and the other 500 calls
  setTimeout in their initial script, the same threads will take turns
  to run those 1000 tasks (500 message events, and 500 timer callbacks).
 
  This is somewhat simplified, and things are a little more complicated
  due to how we handle synchronous network loads (during which we freeze
  an OS thread and remove it from the pool), but the above is the basic
  idea.
 
  / Jonas
 
  That's a really good model. Scalable, and it degrades nicely. The only
  problem is with very long-running operations, where a worker script doesn't
  return in a timely fashion. If enough of them do that, all the others
  starve. What does FF do about that, or in practice do you anticipate that
  not being an issue?
  Webkit dedicates an OS thread per worker. Chrome goes even further (for now
  at least) with a process per worker. The 1:1 mapping is probably overkill as
  most workers will probably spend most of their life asleep just waiting for
  a message.

 We do see it as a problem, but not big enough of a problem that we
 needed to solve it in the initial version.

 It's not really a problem for most types of calculations: as long as
 the number of threads is larger than the number of cores, we'll still
 finish all tasks as quickly as the CPU is able to. Even for long-running
 operations, if it's work that the user wants anyway, it doesn't really
 matter whether the jobs all run in parallel or staggered after each
 other, as long as you're keeping all CPU cores busy.

 There are some scenarios which it doesn't work so well for. For
 example a worker that works more or less infinitely and produces more
 and more accurate results the longer it runs. Or something like a
 Folding@home website which performs calculations as long as the user
 is on a website and submits them to the server.

 If enough of those workers are scheduled it will block everything else.

 This is all solvable of course; there's a lot of tweaking we can do.
 But we figured we wanted to get some data on how people use workers
 before spending too much time developing a perfect scheduling
 solution.


I never did like the Gears model (1:1 mapping with a thread). We were stuck
with a strong thread affinity due to other constraints (script engines,
COM/XPCOM).
But we could have allowed multiple workers to reside in a single thread.
A thread pool (perhaps per origin) sort of arrangement, where once a worker
was put on a particular thread it stayed there until end-of-life.

Your FF model has more flexibility: give a worker a slice
(where slice == run-to-completion) on any thread in the
pool, with no thread affinity whatsoever (if I understand correctly).

 / Jonas



Re: [whatwg] Frame advance feature for a paused VIDEO

2009-06-10 Thread Ian Hickson
On Thu, 21 May 2009, Biju wrote:

 I don't see a way to do a frame-advance feature for a paused VIDEO.
 
 Is there a way to achieve that?
 And frame backward as well.

There is no way to do this today, but I imagine we'll add an API for this 
in due course. It's the first thing on the list of features for the next 
version, in fact.


On Mon, 25 May 2009, Philip Jägenstedt wrote:
 
 If you pause the video and set currentTime, it should advance to that
 frame. As long as you know the frame rate, you're good to go. All in
 theory of course, implementations may not be all the way there yet.

On Tue, 26 May 2009, Robert O'Callahan wrote:

 I don't think there is a standard way to expose the frame rate. We might 
 even want something more general than the frame rate, since conceivably 
 you could have a video format where the interval between frames is 
 variable.

On Tue, 26 May 2009, Robert O'Callahan wrote:
 
 It's more than conceivable, actually --- chained Oggs can have the frame 
 rate varying between segments. So if you're at the last frame of one 
 segment the time till the next frame can be different from the time 
 since the previous frame.

On Tue, 26 May 2009, Philip Jägenstedt wrote:
 
 Indeed, I don't suggest adding an API for exposing the frame rate, I'm 
 just saying that if you know the frame rate by some external means then 
 you can just set currentTime.

On Tue, 26 May 2009, Robert O'Callahan wrote:
 
 OK, sure. Since there are lots of situations where you don't know the 
 frame rate via external means, it seems new API is needed here.

On Mon, 25 May 2009, Jonas Sicking wrote:
 
 There doesn't seem to be a way to do so. Definitely something I think we 
 should consider for the next version of the API.

I agree with the above comments.
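
In the meantime, when the frame rate is known by external means, stepping
can be approximated along the lines Philip describes (a sketch; fps must be
supplied by the author, and seeking is only as frame-accurate as the
implementation allows):

   function stepFrame(video, fps, direction) {  // direction: +1 or -1
     video.pause();
     video.currentTime += direction / fps;
   }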

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Re: [whatwg] Exposing known data types in a reusable way

2009-06-10 Thread Ian Hickson
On Thu, 21 May 2009, Eduard Pascual wrote:

 Within 5.4.1 vCard, by the end of the n property description, the
 spec reads:
 The value of the fn property a name in one of the following forms:
 shouldn't it read:
 The value of the fn property is a name in one of the following forms: ?
 
 Maybe this will grant me a seat for posterity on the acknowledgements
 section =P.

Indeed, thanks!

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Naming of Self-closing start tag state

2009-06-10 Thread Ian Hickson
On Thu, 21 May 2009, Geoffrey Sneddon wrote:

 I think this is a bit of a misnomer, as the current token can be an end 
 tag token (although it will throw a parse error whatever happens once it 
 reaches this state). I suggest renaming it to "self-closing tag state".

I started doing this, but then I stopped because while the whole name of 
the state is wrong, changing it at this point would just confuse people 
who have implementations, and that doesn't seem worth it.

If you need to justify to yourself why the Self-closing start tag state 
can be reached for end tags, just consider that that is why it's a parse 
error -- it's obviously wrong syntax if it mixes start tag and end tag 
syntax. If you need to justify to yourself why the state is called self- 
closing when the syntax in fact has no effect whatsoever, least of all 
actually closing anything, then you haven't got enough problems, and I 
recommend volunteering for some community service or something.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Limit on number of parallel Workers.

2009-06-10 Thread John Abd-El-Malek
The current thinking would be a smaller limit per page (i.e. including all
iframes and external scripts), say around 16 workers.  Then a global limit
for all loaded pages, say around 64 or 128.  The benefit of two limits is to
reduce the chance of pages behaving differently depending on what other
sites are currently loaded.
We plan on increasing these limits by a fair amount once we are able to run
multiple JS threads in a process.  It's just that even when we do that,
we'll still want to have some limits, and we wanted to use the same approach
now.
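
A sketch of the two-tier admission check described above, using the example
numbers from this thread (the per-page bookkeeping is hypothetical):

   var PER_PAGE_LIMIT = 16;
   var GLOBAL_LIMIT = 64;
   var globalWorkerCount = 0;
   function canStartWorker(page) {
     return page.workerCount < PER_PAGE_LIMIT &&
            globalWorkerCount < GLOBAL_LIMIT;
   }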

On Wed, Jun 10, 2009 at 2:56 PM, Robert O'Callahan rob...@ocallahan.org wrote:

 On Thu, Jun 11, 2009 at 5:24 AM, Drew Wilson atwil...@google.com wrote:

 That's a great approach. Is the pool of OS threads per-domain, or per
 browser instance (i.e. can a domain DoS the workers of other domains by
 firing off several infinite-loop workers)? Seems like having a per-domain
 thread pool is an ideal solution to this problem.


 You probably still want a global limit, or else malicious sites can DoS
 your entire OS by spawning workers in many synthetic domains. Making the
 limit per-eTLD instead of per-domain would help a bit, but maybe not very
 much. Same goes for other kinds of resources; there's no really perfect
 solution to DoS attacks against browsers, AFAICT.

 Rob
 --
 He was pierced for our transgressions, he was crushed for our iniquities;
 the punishment that brought us peace was upon him, and by his wounds we are
 healed. We all, like sheep, have gone astray, each of us has turned to his
 own way; and the LORD has laid on him the iniquity of us all. [Isaiah
 53:5-6]



Re: [whatwg] Removing the need for separate feeds

2009-06-10 Thread Ian Hickson
On Fri, 22 May 2009, Dan Brickley wrote:
 On 22/5/09 09:21, Ian Hickson wrote:
  On Fri, 22 May 2009, Henri Sivonen wrote:
   On May 22, 2009, at 09:01, Ian Hickson wrote:
   USE CASE: Remove the need for feeds to restate the content of HTML
   pages (i.e. replace Atom with HTML).
   Did you do some kind of "Is this Good for the Web?" analysis on this
   one? That is, do things get better if there's yet another feed format?
  
  As far as I can tell, things get better if the feed format and the default
  output format are the same, yes. Generally, redundant information has
  tended to lead to problems.
 
 Would this include having a mechanism (microdata? xml islands?) that preserves
 extension markup from Atom feeds? e.g. see
 http://www.ibm.com/developerworks/xml/library/x-extatom1/

Actually the algorithm to convert HTML to Atom doesn't even support all of 
Atom, let alone extensions. However, it's quite possible to extend HTML 
itself if it is to be used as a native feed format, as described here:

   
http://wiki.whatwg.org/wiki/FAQ#HTML5_should_support_a_way_for_anyone_to_invent_new_elements.21


On Fri, 22 May 2009, Adrian Sutton wrote:
 On 22/05/2009 08:21, Ian Hickson i...@hixie.ch wrote:
  As far as I can tell, things get better if the feed format and the 
  default output format are the same, yes. Generally, redundant 
  information has tended to lead to problems.
 
 Can you point to examples of this in relation to the use of feeds in 
 particular?

Smylers listed more than I could think of:

On Fri, 22 May 2009, Smylers wrote:
 
 I can't find examples right now, but I have encountered various problems 
 along these lines in the past, including:
 
 * The feed suddenly becomes empty.
 * A new blog has a 'feed' link, but it never works.
 * A blog's feed URL changes, but doesn't redirect.
 * A feed is misformatted in a way which causes it to be ignored.
 * The content of a feed is misformatted, such that in a feed reader its
   display is mangled, such as HTML tags and entities showing, or spaces
   having been squeezed out from around tags such that linked words don't
   have spaces around them.
 * The content of a feed has certain critical information, such as an
   image, stripped from it, such that it makes no sense, or has a
   different meaning from the full post.
 * The content of a feed has certain critical mark-up stripped from it,
   such as <sup> around exponents in a mathematical expression rendering
   36 where 3 to the power of 6 was intended.
 
 In all cases the HTML version of the blog had correctly displaying and 
 updating content; only the feed was affected by the issues.  This 
 usually left the author unaware of the problem, as they don't subscribe 
 to their own blog.


On Fri, 22 May 2009, Adrian Sutton wrote:

 This feels a lot like jumping the shark and solving a problem that has 
 already been solved at one end (syndicating content) and doesn't exist 
 at the other (syndicated content being out of sync with the HTML 
 version).

It seems like defining how one converts HTML to Atom is useful in general 
even if -- maybe even especially if -- the desire is to use Atom.


On Fri, 22 May 2009, Eduard Pascual wrote:

 While redundant *source* information easily leads to problems, from what
 I have seen, sites using feeds tend almost always to be dynamic: both
 the HTML pages and the feeds are generated via server scripts from the
 *same set of source data*, normally from a database. This is especially 
 true for blogs, and any other CMS-based site, since CMSs normally rely a 
 lot on databases and server-side scripting. So on these cases we don't 
 actually have redundant information, but just multiple ways to retrieve 
 the same information.

That seems plausible, yes.


 For manually authored pages and feeds things would be different; but is
 there really a significant number of such cases out there? I can't say
 I have seen the entire web (who can?), but among what I have seen, I
 have never encountered any hand-authored feed, except for code examples
 and similar experimental stuff.

On Fri, 22 May 2009, Toby Inkster wrote:
 
 Surely this proves the need for a way of extracting feeds from HTML?

I don't know if it proves it per se, but it certainly indicates that there 
is a possible need.

I added the section on how to convert HTML pages to Atom based on requests 
over the years and most recently specifically in the context of the 
microdata section. It doesn't replace Atom, nor is anyone required to 
author HTML in any particular way because of this; it merely provides a 
migration path if one is desired. I think enabling this kind of 
interoperability between standards can only be good.


On Fri, 22 May 2009, Adrian Sutton wrote:
 On 22/05/2009 11:36, Toby Inkster m...@tobyinkster.co.uk wrote:
  
  You never see manually written feeds because people can't be bothered 
  to manually write feeds. So the people who manually author HTML simply 
  don't