Re: [whatwg] Alternative method of declaring prefixes in RDFa (was Re: RDFa is to structured data, like canvas is to bitmap and SVG is to vector)
Manu Sporny mspo...@digitalbazaar.com, 2009-01-18 19:18 -0500:

> Speaking as an RDFa Task Force member - we're currently looking at an alternative prefix binding mechanism, so that this:
>
> xmlns:foaf="http://xmlns.com/foaf/0.1/"
>
> could also be declared like this in non-XML family languages:
>
> prefix="foaf=http://xmlns.com/foaf/0.1/"

Is there a draft spec proposal for that available yet? Or maybe a URL for an archived mailing-list discussion about it?

--Mike

-- 
Michael(tm) Smith
http://people.w3.org/mike/
Re: [whatwg] Alternative method of declaring prefixes in RDFa
Michael(tm) Smith m...@w3.org, 2009-01-19 17:40 +0900:

> Manu Sporny mspo...@digitalbazaar.com, 2009-01-18 19:18 -0500:
>
> > prefix="foaf=http://xmlns.com/foaf/0.1/"
>
> URL for an archived mailing-list discussion about it?

OK, I found this:

http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/thread.html#msg74
http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/0074.html

-- 
Michael(tm) Smith
http://people.w3.org/mike/
Re: [whatwg] Alternative method of declaring prefixes in RDFa (was Re: RDFa is to structured data, like canvas is to bitmap and SVG is to vector)
On Jan 19, 2009, at 02:18, Manu Sporny wrote:

> Toby A Inkster wrote:
>
> > So RDFa, as it is currently defined, does need a CURIE binding mechanism. XML namespaces are used for XHTML+RDFa 1.0, but given that namespaces don't work in HTML, an alternative mechanism for defining them is expected, and for consistency would probably be allowed in XHTML too - albeit in a future version of XHTML+RDFa, as 1.0 is already finalised. (I don't speak for the RDFa task force as I am not a member, but I would be surprised if many of them disagreed with me strongly on this.)
>
> Speaking as an RDFa Task Force member - we're currently looking at an alternative prefix binding mechanism, so that this:
>
> xmlns:foaf="http://xmlns.com/foaf/0.1/"
>
> could also be declared like this in non-XML family languages:
>
> prefix="foaf=http://xmlns.com/foaf/0.1/"
>
> The thought is that this prefix binding mechanism would be available in both XML and non-XML family languages.

Considering recent messages in this thread, using full URIs and refraining from declaring 'http' as a namespace prefix in XHTML would be more backwards compatible than minting a new attribute called 'prefix'. (I haven't verified the test results about using full URIs myself.)

Even though switching over to 'prefix' in both HTML and XHTML would address the DOM Consistency concern, using them for RDF-like URI mapping as opposed to XML names would remove the issue of having to pass around compound values, and putting them on the same layer of the layer cake would remove most objections related to qnames-in-content, some usual problems with Namespaces in XML would remain:

* Brittleness under copy-paste due to prefixes potentially being declared far away from the use of the prefix in source.
* Various confusions about the prefix being significant.
* The problem of generating nice prefixes algorithmically without maintaining a massive table of known RDF vocabularies.
* Negative savings in syntax length when a given prefix is only used a couple of times in a file.

> The reason that we used xmlns: was because our charter was to specifically create a mechanism for RDF in XHTML markup. The XML folks would have berated us if we created a new namespace declaration mechanism without using an attribute that already existed for exactly that purpose.

The easy way to avoid accusations of inventing another declaration mechanism is not to have a declaration mechanism. URIs already have namespacing built into their structure.

You seem to be taking as a given that there needs to be an indirection mechanism for declaring common URI prefixes. As far as I can tell, an indirection mechanism isn't a hard requirement flowing from the RDF data model. After all, N-Triples don't have such a mechanism.

> That being said, we're now being berated by the WHATWG list for doing the Right Thing per our charter... sometimes you just can't win :)

Groups have a say on what goes into their charter, so it's not like a group is powerlessly following a charter forced upon it entirely from the outside. :-)

> I don't think that the RDFa Task Force is as rigid in their positions as some on this list are claiming... we do understand the issues, are working to resolve issues or educate where possible and desire an open dialog with WHATWG.

Great!

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/
Re: [whatwg] Alternative method of declaring prefixes in RDFa
Michael(tm) Smith wrote:

> Michael(tm) Smith m...@w3.org, 2009-01-19 17:40 +0900:
>
> > Manu Sporny mspo...@digitalbazaar.com, 2009-01-18 19:18 -0500:
> >
> > > prefix="foaf=http://xmlns.com/foaf/0.1/"
> >
> > URL for an archived mailing-list discussion about it?
>
> OK, I found this:
>
> http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/thread.html#msg74
> http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Jan/0074.html

I believe that the thread started here; @prefix is a small part of the conversation:

http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2008Sep/0001.html

... it's fairly involved, and a good bit has changed since the discussion back in September. The goal, though, is to provide a non-XML mechanism for declaring CURIE prefixes.

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.1 Website Launch
http://blog.digitalbazaar.com/2009/01/16/bitmunk-3-1-website-launch
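For readers who want to see what a non-XML prefix binding would mean in practice, here is a minimal Python sketch. It assumes the attribute value has the form name=URI as quoted in this thread; accepting several whitespace-separated bindings in one value, and the function names themselves, are assumptions made for illustration and not part of any agreed syntax.

```python
def parse_prefix_attribute(value):
    """Parse a value such as 'foaf=http://xmlns.com/foaf/0.1/' into a dict.

    Assumption: multiple bindings may be separated by whitespace.
    """
    bindings = {}
    for item in value.split():
        # Split at the first '=' only, so URIs containing '=' survive intact.
        name, sep, uri = item.partition("=")
        if sep and name and uri:
            bindings[name] = uri
    return bindings


def expand_curie(curie, bindings):
    """Expand 'prefix:reference' to a full URI, or return None if unbound."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in bindings:
        # Choice made for this sketch: an undeclared prefix yields nothing.
        return None
    return bindings[prefix] + reference


bindings = parse_prefix_attribute("foaf=http://xmlns.com/foaf/0.1/")
print(expand_curie("foaf:name", bindings))
print(expand_curie("dc:title", bindings))
```

Returning None for an unbound prefix is only this sketch's way of saying that nothing should be produced when the mapping is missing.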
[whatwg] embedding meta data for copy/paste usages - possible use case for RDF-in-HTML?
If this was discussed already, sorry. There has been so much RDF/meta data discussion that I'm far from on top of it..

I'd like some way to add meta data to a page that could be integrated with the UA's copy/paste commands.

For example, if I copy a sentence from Wikipedia and paste it in some word processor, it would be great if the word processor offered to automatically create a bibliographic entry.

If I copy the name of one of my Facebook friends and paste it into my OS address book, it would be cool if the contact information was imported automatically. Or maybe I pasted it in my webmail's address book feature, and the same import operation happened..

If I select an E-mail in my webmail and copy it, it would be awesome if my desktop mail client would just import the full E-mail with complete headers and different parts if I just switch to the mail client app and paste.

To make such use cases possible I suppose what we need is

a) some way to embed standardised interchangeable meta data in HTML (so that users can copy from regular web pages)

b) some support in the UA for figuring out what meta data applies to a selection and, say, place three alternative formats on the clipboard:
1) text/plain
2) text/html
3) application/metasomething+xml

c) support in other applications for detecting the third format on the clipboard, parsing and using it. For example, a web application might use the HTML5 clipboard data API to detect the meta data, parse it with the UA's XML parser, and figure out if it was data it could make use of. Most applications would use *both* the regular text (plain or HTML) format and the meta data.

Would anyone use this? I think that actually some of the functionality we would enable here would be so compelling that users would request it. If, for example, Wikipedia -> OpenOffice pasting created automatic bibliography entries users would start asking why Encyclopedia Britannica -> Microsoft Word did not. If Myspace.com let you copy a selected contact and paste in some webmail or OS address book, Facebook users would start several Facebook groups trying to get it working there.

-- 
Hallvord R. M. Steen
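Steps b) and c) above can be modelled in a few lines of Python. This is only a sketch: the clipboard is represented as a plain dict keyed by MIME type, "application/metasomething+xml" is the placeholder name from the message rather than a registered type, and the tiny XML vocabulary (source, url, title) is invented purely for the example.

```python
import xml.etree.ElementTree as ET

# Placeholder MIME type taken from the message; not a real registered type.
META_TYPE = "application/metasomething+xml"


def build_clipboard(plain_text, html, metadata_xml=None):
    """Step b): the copying side offers up to three alternative formats."""
    clipboard = {"text/plain": plain_text, "text/html": html}
    if metadata_xml is not None:
        clipboard[META_TYPE] = metadata_xml
    return clipboard


def read_metadata(clipboard):
    """Step c): the pasting side looks for the third format and parses it."""
    raw = clipboard.get(META_TYPE)
    if raw is None:
        return None  # fall back to the plain text or HTML formats only
    root = ET.fromstring(raw)
    # Hypothetical flat vocabulary: one child element per metadata field.
    return {child.tag: (child.text or "") for child in root}


clip = build_clipboard(
    "A sentence.",
    "<p>A sentence.</p>",
    "<source><url>http://example.org/a</url><title>Example</title></source>",
)
print(read_metadata(clip))
```

An application that does not know the metadata type simply ignores that key and pastes the text as it does today, which is why the proposal lists the metadata as a third alternative rather than a replacement.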
Re: [whatwg] embedding meta data for copy/paste usages - possible use case for RDF-in-HTML?
Hallvord R M Steen wrote:

> I'd like some way to add meta data to a page that could be integrated with the UA's copy/paste commands.

These use cases are a good start, but the problem is that you've begun with the assumption that copy and paste would be a part of the solution.

> For example, if I copy a sentence from Wikipedia and paste it in some word processor, it would be great if the word processor offered to automatically create a bibliographic entry.

Do you mean a bibliographic entry that references the source web site, and includes information such as the URL, title, publication date and author names?

That could be a useful feature, even if it could only obtain the URL and title easily. Often, when writing an article that quotes several websites, it's a time-consuming process to copy and paste the quote, then the page or article title and then the URL to link to it. An editor with a "Paste as Quotation" feature which helped automate that would be useful.

HTML5 already contains elements that can be used to help obtain this information, such as the title, article and its associated heading h1 to h6 and time. Obtaining author names might be a little more difficult, though perhaps hCard might help.

> If I copy the name of one of my Facebook friends and paste it into my OS address book, it would be cool if the contact information was imported automatically. Or maybe I pasted it in my webmail's address book feature, and the same import operation happened..

I believe this problem is adequately addressed by the hCard microformat and various browser extensions that are available for some browsers, like Firefox. The solution doesn't need to involve a copy and paste operation. It just needs a way to select contact info on the page and export it to an address book.

There are even web services that will parse an HTML page and output a vCard file that can be imported directly into address book programs.

> If I select an E-mail in my webmail and copy it, it would be awesome if my desktop mail client would just import the full E-mail with complete headers and different parts if I just switch to the mail client app and paste.

Couldn't this be solved by the web mail server providing an export feature which let the user download the email as an .eml file and open it with their mail client? Again, I don't believe the solution to this requires a copy and paste operation.

However, I'm not sure what problem you're trying to solve. Why would a user want to do this? Why can't users who want to access their email using a mail client use POP or IMAP?

-- 
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Re: [whatwg] Spellchecking mark III
Spell checking of regions of text should be governed by the lang attribute, if any, and browser preferences; it would be switched off for language tags the spell-checking engine does not support, including custom ones. It is extremely annoying how Safari, although (supposedly) localized to Polish, wants all input to be in English. IMHO, Chris
Re: [whatwg] Spellchecking mark III
On Tue, Dec 30, 2008 at 3:38 AM, Ian Hickson i...@hixie.ch wrote:

> The same engineers have since implemented this feature in Chrome also,

Incorrect. One engineer implemented a crude hack in a small portion of the Chromium glue code that implements a fraction of the spec -- enough to make Gmail work a little more nicely, and that's about it.

On Wed, Dec 31, 2008 at 7:15 AM, Maciej Stachowiak m...@apple.com wrote:

> 2) The proposal Hixie linked seems way overengineered for this purpose. First, it allows spellchecking to be explicitly turned on, potentially overriding normal defaults, but that seems wrong; an input type=email should never spellcheck regardless of what the page author says. I can't see any valid use case for the author turning spellchecking on regardless of UA defaults or user preferences.

Email subject line boxes. In Firefox (where I implemented support for this attribute matching Hixie's spec), the default is to spellcheck multiline boxes and not single-line boxes, which meant that webmail subject line fields would not be spellchecked by default.

> Second, it allows spellchecking to be controlled at a finer granularity than editability, for which again I think there is no valid use case.

Besides the above example in the positive direction, the negative direction is, again, editable fields which you don't want spellchecked, e.g. email recipient list fields (which may be multiline and contain whitespace).

I agree with Roc that it is not practical for UAs to detect (via heuristics) which fields should and should not be checked in all cases, and spellchecking desirability seems finer grained than editability to me (not completely orthogonal, as I don't think non-editable fields should ever be spellchecked).

I also agree with Roc that this is not complicated, in practice, to implement. It was a tricky patch for me in Firefox since I was not familiar with any of the associated code, but the actual logic of the spec was not hard at all.

I support adding Hixie's spec, as-is, to HTML5. It's implemented in Firefox, it's desired in Opera, and there's a bug on file to add support for it to WebKit (which I would like to do someday).

PK
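The two examples in this message (subject lines that should be checked, recipient lists that should not) can be captured in a small decision function. The Python below is a sketch of the behaviour as described here, not Firefox's code and not the spec's algorithm: the function name and parameters are hypothetical, and it ignores inheritance from ancestor elements and user preferences.

```python
def should_spellcheck(spellcheck_attr, editable, multiline):
    """Decide whether a field is spellchecked.

    spellcheck_attr: "true", "false", or None when the author said nothing.
    editable:        whether the field can be edited at all.
    multiline:       True for textarea-like boxes, False for single-line ones.
    """
    if not editable:
        # Non-editable fields are never spellchecked, whatever the author says.
        return False
    if spellcheck_attr == "true":
        return True
    if spellcheck_attr == "false":
        return False
    # No author hint: fall back to the default described in the message,
    # i.e. check multiline boxes but not single-line boxes.
    return multiline


# Webmail subject line: single-line, so unchecked unless the author opts in.
print(should_spellcheck(None, editable=True, multiline=False))
print(should_spellcheck("true", editable=True, multiline=False))
# Multiline recipient list: checked by default unless the author opts out.
print(should_spellcheck("false", editable=True, multiline=True))
```

The point of the sketch is that the author's hint only matters in the two cases where the heuristic default gets it wrong, which is the argument being made for keeping the attribute.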
[whatwg] Script/parser interaction bug?
I have a test case that works in major browsers (FF, Opera, Safari, IE6) but that I don't think would work if they followed the behavior as currently specified in HTML5. I've put the test case online:

http://stakface.com/pub/mango/ext7.html

The assertion document.getElementById('r').firstChild.data == 'PASS' is true after the page has loaded, whereas according to the spec I don't think it should be. The steps are roughly as follows:

- tokenize/treebuild ext7.html until the first closing script tag is found (for the 7a.js script)
- run the script. this sets 7a.js to be the pending external script
- execute the pending external script (7a.js) since it's not a re-entrant invocation of the tree builder
--- insert the 7b.js line into the input stream
--- tokenize/treebuild the 7b.js script tag until the </script> for 7b.js is found
--- run the script. this sets 7b.js to be the pending external script
--- now, since there is a pending external script and this is a re-entrant invocation, set the pause flag to true and bail
--- insert the other stuff in 7a.js into the input stream
--- since the parser pause flag is set this other stuff does NOT get tokenized/treebuilt yet
- 7a.js finishes executing, and now we have a new pending external script, which is 7b.js
- execute 7b.js
--- throws
- continue processing input stream (this now has the contents of the document.write calls from 7a.js, line 2 onwards)
- tokenize/treebuild the input stream until the </script> that was document.write'd at the bottom of 7a.js is encountered
- execute the script
--- insert the div into the input stream
--- since the parser pause flag is still set the div does NOT get tokenized/treebuilt
--- run the line that sets .firstChild.data to PASS. since the div isn't in the DOM yet, this throws and the script is done
- unwind back to the treebuilder, which clears the parser pause flag since the script nesting level drops to zero
- tokenize/treebuild the input stream, which contains the div tag
- add div with content FAIL to the DOM
- done

Here, I think the pause flag needs to get cleared earlier, so that when the div is inserted into the input stream, it gets tokenized and added to the DOM. This would make the behavior consistent with what I'm seeing in major browsers. Thoughts?

kats
Re: [whatwg] Spellchecking mark III
On Mon, Jan 19, 2009 at 4:53 PM, Robert O'Callahan rob...@ocallahan.org wrote:

> Actually I was just poking around and noticed that we don't actually support variation of spellcheck values within different parts of an editable element. So I won't make any claims about how hard that is to support.

Doesn't the spec only define things on a per-element level of granularity? I wasn't really paying attention to this side-conversation of yours so I didn't think to confirm/refute it. But I don't think the spec in fact covers doing such a thing.

PK
Re: [whatwg] embedding meta data for copy/paste usages - possible use case for RDF-in-HTML?
2009/1/20 Jamie Rumbelow ja...@jamierumbelow.net:

> I think that the already available solution to your problem are Microformats - you are essentially embedding metadata, semantically in HTML.

Of course, but I think your comment misses half of the proposed solution.. namely what format the UA puts the information on the clipboard in. If you say microformats is the solution, I assume you mean UAs should put HTML fragments with microformat-type attributes and values (mainly class) on the clipboard as text/html, and applications that were targeted by a paste operation should have HTML parsers and implement support for specific microformats. Which is why you added:

> Beside this, the applicability is rather specific - every application would need built in support and every website would have to markup the data in a specific way to support the application's format. This could get far too confusing and complicated...

It would not necessarily need support from the website - the UA could have some logic to create associated meta data (URL, title, possibly author from META tags though that wouldn't be very reliable) for the bibliographic stuff if the page did not contain more specific meta data for this purpose.

With Facebook I could write a Facebook application to generate the meta data format - Facebook would not really need to support this. With any other website I could add a User JavaScript or Greasemonkey script that was aware of that site's markup and could extract the information in a site-specific way and make it available to the UA as HTML-embedded meta data..

-- 
Hallvord R. M. Steen
Re: [whatwg] Alternative method of declaring prefixes in RDFa
Just a couple of clarifications - not trying to convince anybody of anything, just setting the record straight.

Henri Sivonen wrote:

> Even though switching over to 'prefix' in both HTML and XHTML would address the DOM Consistency concern, using them for RDF-like URI mapping as opposed to XML names would remove the issue of having to pass around compound values, and putting them on the same layer of the layer cake would remove most objections related to qnames-in-content, some usual problems with Namespaces in XML would remain:
>
> * Brittleness under copy-paste due to prefixes potentially being declared far away from the use of the prefix in source.
> * Various confusions about the prefix being significant.

There does not seem to be agreement or data to demonstrate just how significant these issues are... to some they're minor, to others major. I'm not saying it isn't an issue. It certainly is an issue, but one that was identified as having little impact.

RDFa, by design, does not generate a triple unless it is fairly clear that the author intended to create one. Therefore, if prefix mappings are not specified, no triples are generated. In other words, no bad data is created as a result of a careless cut/paste operation. The author will notice the lack of triple generation when checking the page using a triple debugging tool such as Fuzzbot (assuming that they care).

> * The problem of generating nice prefixes algorithmically without maintaining a massive table of known RDF vocabularies.

This is a best-practices issue and one that is a fairly easy problem to solve with a wiki. Here's an example of one solution to your issue:

http://rdfa.info/wiki/best-practice-standard-prefix-names

> * Negative savings in syntax length when a given prefix is only used a couple of times in a file.

The cost of specifying the prefix for foaf, when foaf is only specified once in a document, is:

len("xmlns:foaf='http://xmlns.com/foaf/0.1/'") + len("foaf:") - len("http://xmlns.com/foaf/0.1/") == 18 characters

The cost of specifying the prefix for foaf, when foaf is used two times in a document is:

len("xmlns:foaf='http://xmlns.com/foaf/0.1/'") + len("foaf:") - len("http://xmlns.com/foaf/0.1/")*2 == -8 characters

So, in general, your setup cost is recouped if you have more than 1 instance of the prefix in a document... which was one of the stronger reasons for providing a mechanism for specifying prefixes in RDFa.

> > The reason that we used xmlns: was because our charter was to specifically create a mechanism for RDF in XHTML markup. The XML folks would have berated us if we created a new namespace declaration mechanism without using an attribute that already existed for exactly that purpose.
>
> The easy way to avoid accusations of inventing another declaration mechanism is not to have a declaration mechanism. URIs already have namespacing built into their structure.
>
> You seem to be taking as a given that there needs to be an indirection mechanism for declaring common URI prefixes. As far as I can tell, an indirection mechanism isn't a hard requirement flowing from the RDF data model.

We did not take the @prefix requirement as a given; it was a requirement flowing from the web authoring community (the ones that still code HTML and HTML templates by hand), the use cases, as well as the RDF community. I would expect the HTML5 LC or CR comments to reflect the same requirements if WHATWG were to adopt RDFa without support for CURIEs.

> After all, N-Triples don't have such a mechanism.

You are correct - N-Triples do not... however, Turtle, Notation 3, and RDF/XML do specify a prefixing mechanism. Each does so because it was deemed useful by the people and workgroups that created each one of those specifications.

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.1 Website Launch
http://blog.digitalbazaar.com/2009/01/16/bitmunk-3-1-website-launch
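The character-count argument in this message is easy to check with a few lines of Python. This is only a sketch; the function name is made up, and it makes one assumption about the intended accounting: the short form "foaf:" is counted once per use, just as the full URI is. Under that assumption the one-use figure matches the 18 characters quoted above, while the two-use figure comes out at -3 rather than -8 (the quoted formula counts "foaf:" only once). The conclusion is the same either way: the declaration pays for itself from the second use.

```python
DECLARATION = "xmlns:foaf='http://xmlns.com/foaf/0.1/'"
CURIE_PREFIX = "foaf:"
FULL_URI = "http://xmlns.com/foaf/0.1/"


def net_cost(uses):
    """Characters added (positive) or saved (negative) by declaring a prefix.

    Assumption: each use writes 'foaf:' instead of the full URI.
    """
    return len(DECLARATION) + uses * len(CURIE_PREFIX) - uses * len(FULL_URI)


# The formula exactly as written in the message for the two-use case.
quoted_two_use = len(DECLARATION) + len(CURIE_PREFIX) - 2 * len(FULL_URI)

for uses in (1, 2, 3):
    print(uses, net_cost(uses))
print("as quoted for two uses:", quoted_two_use)
```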
Re: [whatwg] embedding meta data for copy/paste usages - possible use case for RDF-in-HTML?
> > I'd like some way to add meta data to a page that could be integrated with the UA's copy/paste commands.
>
> These use cases are a good start, but the problem is that you've begun with the assumption that copy and paste would be a part of the solution.

That's not a bug, it's a feature :)

Ian said a while ago that coming up with end-user friendly UI ideas for the RDF stuff was harder than doing the technical work - implying that if there are no good UI ideas, browser vendors would not find a nice way to let users use metadata, and many of the use cases for embedding it in HTML would not really be feasible. Thus, this *is* a UI proposal, aiming to show that an operation nearly *all* users are familiar with could be enhanced with richer ways to embed meta data in HTML.

> > For example, if I copy a sentence from Wikipedia and paste it in some word processor, it would be great if the word processor offered to automatically create a bibliographic entry.
>
> Do you mean a bibliographic entry that references the source web site, and included information such as the URL, title, publication date and author names?

Exactly.

> That could be a useful feature, even if it could only obtain the URL and title easily. Often, when writing an article that quotes several websites, it's a time consuming process to copy and paste the quote, then the page or article title and then the URL to link to it. An editor with a "Paste as Quotation" feature which helped automate that would be useful.

It would be great. I hate the clumsy back-and-forward switching to copy/paste all those bits of information ;-p

> HTML5 already contains elements that can be used to help obtain this information, such as the title, article and its associated heading h1 to h6 and time. Obtaining author names might be a little more difficult, though perhaps hCard might help.

Indeed. And it's not an either-or counter-suggestion to my proposal: UAs could fall back to extracting such data if more structured meta data is not available.

> > If I copy the name of one of my Facebook friends and paste it into my OS address book, it would be cool if the contact information was imported automatically. Or maybe I pasted it in my webmail's address book feature, and the same import operation happened..
>
> I believe this problem is adequately addressed by the hCard microformat and various browser extensions that are available for some browsers, like Firefox. The solution doesn't need to involve a copy and paste operation. It just needs a way to select contact info on the page and export it to an address book.

This is way more complicated for most users. Your last sentence IMO is not an appropriate way to use the word "just", seeing that you need to find and invoke an export command, handle files, find and invoke an import command and clear out the duplicated entries.. This is impossible for several users I can think of, and even for techies like us doing so repeatedly will eventually be a chore (even if we CAN, it doesn't mean that's the way we SHOULD be working). Besides, it doesn't really address the "copy ONE contact's information" use case well.

Also, should any program that wants to support copy-and-paste of contact information have to support text/html parsing and look for class= values? I guess that would be quite some work for the rather limited functionality microformats gives you. It would be better with a microformat-aware UA generating a common meta data interchange format for the clipboard, and from there it seems a small step to allow web page authors to embed richer meta data the UA can use to generate the clipboard meta data, right there in their HTML.

> > If I select an E-mail in my webmail and copy it, it would be awesome if my desktop mail client would just import the full E-mail with complete headers and different parts if I just switch to the mail client app and paste.
>
> Couldn't this be solved by the web mail server providing an export feature which let the user download the email as an .eml file and open it with their mail client?

Of course, that or POP/IMAP access is the way things currently work.

> Again, I don't believe the solution to this requires a copy and paste operation.

..but I think it would be more intuitive and user friendly if something like that worked. (Or drag-and-drop an E-mail from the webmail to the desktop client/file system/other webmail, which is basically the same thing).

> However, I'm not sure what problem you're trying to solve. Why would a user want to do this? Why can't users who want to access their email using a mail client use POP or IMAP?

Granted this use case is a bit more far-fetched (but I know people who copy E-mails from their Outlook and paste in Windows Explorer! - for backing up or archiving a message they want to keep).

-- 
Hallvord R. M. Steen
Re: [whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector
On Jan 18, 2009, at 8:43 AM, Shelley Powers wrote:

> Take you guys seriously...OK, yeah.
>
> I don't doubt that the work will be challenging, or problematical. I'm not denying Henri's claim. And I didn't claim to be the one who would necessarily come up with the solutions, either, but that I would help in those instances that I could.
>
> What I did express in the later emails, is what others have expressed who have asked about RDFa in HTML5: are we wasting our time even trying? That it seems like a decision has already been made, and we're spinning our wheels even attempting to find solutions. There's a difference between not being willing to negotiate, compromise, work the problem, and just spitting into the wind for no good.

Based on past experience, I would say that you are not wasting your time. Evidence-based arguments, explication of use cases, solutions to technical problems, persuading third parties, and getting implementation traction (for example in popular JavaScript libraries, major browser engines, popular authoring/publishing software) will all affect how a feature is seen.

As past examples, allowing XML-like self-closing tag syntax for void elements in text/html, and ability to include SVG inline in text/html, are both features that were highly controversial and at times opposed by the editor and others. Nonetheless we seem to be on track to have both of these in the spec. Note that in the case of SVG especially, the path from initial proposal to rough consensus to actual integration with the spec was a long one. In fact, integration in the spec is not yet fully complete due to some disputes about the details of the syntax.

Another example is the headers attribute, and the more general issue of header association in tables. Though the headers attribute was controversial and once opposed by the editor, it is now in the spec.

I believe that most of us here, while we may have our biases and preconceptions, will evaluate concrete technical arguments in good faith, and are prepared to change our minds. The fact is that people have changed positions in the past, Ian included. So nothing should be assumed to be a done deal, especially at this early stage of exploring metadata embedding and RDFa.

> > > However, the debate ended as soon as Ian re-asserted his authority.
> >
> > Ian just gave an indication of when he's going to work on this again. That doesn't mean that research into e.g. DOM consistency can't happen meanwhile. It also doesn't mean that debate needs to stop.
>
> No, Ian's listing of tasks pretty much precluded any input into the decision making process other than his own. I never see "we" when Ian writes, I only see "I".

Ian intends to make an evaluation based on evidence and arguments presented. Presenting such evidence and arguments is input into the decision making process. That's how other changes to the spec that went against Ian's initial gut instinct happened. Indeed it is possible for Ian to be overruled if he is clearly blocking the consensus of the group(*), but so far that has not been necessary, even on controversial issues.

I encourage you to provide input into the process, and not to get too frustrated if the process is not quick. Nor by the fact that some may initially (or even finally, when all is said and done) disagree with you.

Regards,
Maciej

* - The HTML WG can take a vote which is binding at least in the W3C context or remove Ian as editor; and the WHATWG oversight group can remove Ian as editor or pressure him by virtue of having the authority to remove him.