On 6/5/06 10:27 AM, "Michael Leikam" <[EMAIL PROTECTED]> wrote:
> Tantek,
>
> Thanks. You're obviously more familiar with designing data
> formats than I am, but could you (or anybody else really)
> explain a little

I'm guessing we're going to want to add the rest of this to the FAQ.

> more about the differences you see between
> supporting DOM manipulation during the parsing, as I've
> suggested, and supporting include-patterns?

They are completely different. How do you see them as being the same?

include-patterns are simply transclusion - not a manipulation at all.

> To me include-patterns seem like a subset of DOM

Would you also say that the <img> element is a subset of the DOM? How about just the <object> element by itself which allows for implementations to "fall back" to its contents if the referenced data is not supported?

The biggest difference here is the difference between declarative and procedural processing. Just because procedural processing can mimic declarative processing doesn't mean that declarative processing is procedural.

> and both
> seem less to do with the data format itself than the
> inherently procedural transformation from one format to
> another.

includes are not procedural. includes are merely aggregation.

> What is the difference between defining a data
> format and defining what people do with that data format
> (i.e., what that data format is used for)?

Defining a data format defines a syntax, grammar, and semantics, as well as often an abstract model of the data. That's very different than trying to define all possible applications for that format. Nor does defining how to parse a format from the syntax using the grammar into the abstract model imply that you are defining applications for that format.

> I do see the benefit of having an <object> within a
> microformatted block of content. There isn't a process
> that needs to run in order for the syntax to reflect the
> included data.

That's a good summary.
You don't have to perform some Turing-complete embedded computation in order to extract the semantics.

> It's also more scanable by human eyes than
> a block of javascript or xsl.

This is always a HUGE plus for microformats.

> But in order for the parser
> to generate the target format, you've defined this
> procedure:
> ---------
> if class is "include", grab the referenced node including
> descendants and replace the current node with the
> referenced one.
> ---------

Yes, that is a simple way of defining how to process object-includes.

The key here is, that that is within the context of *parsing* the microformat. Parsing is already a well defined process and this is simply adding another detail to it. This is very different than adding say, a virtual machine that processes arbitrary loops and conditionals.

> I guess at root I'm unclear about whether maxims regarding
> data formats also apply to data parsing?

What distinction do you see between data formats and data parsing?

A well defined data format includes enough syntax/grammar details that the data parsing is 100% deterministic from the data format.

> The parallel I
> see is microformat:parsing::XML:DOM. You want to avoid
> procedural rules in the X(HT)ML, but the DOM exists to
> formalize them. Is that faulty?

Yes that is a faulty analogy.

Part of the problem is the ambiguity of what people mean when they use the phrase "the DOM".

1. If by "the DOM" you merely mean the abstract node and attribute structure, then yes, parsing microformats also gives you a "DOM" which you can then further process however you see fit. In fact, this is what DOM *literally* means: document object *model*.

2. If however by "the DOM" you mean a data abstraction + a set of predefined methods/functions/procedures which you can apply to that abstraction, then the answer is no.
Unfortunately most people seem to conflate these two definitions and/or assume that #2 is the only definition because the W3C "DOM" specs all define both a model and a set of APIs.

If you want to try distinguishing these to avoid confusion, perhaps call the abstraction "DOM Model" (which I realize is redundant, but unfortunately the emphasis is necessary due to the confusion that it seems most web programmers have), and the methods/functions/procedures "DOM API".

Others have tried to introduce the term "Infoset"[1] to mean the "model" since the "model" has been subsumed to include APIs. However, I have found that in practice, the term "Infoset" has very low comprehensibility.

> The sort of markup I had in mind was something like this:
> ---------
> <div id="company">
>   <div class="hcard">
>     <h1 class="fn org">Michael's Webby Widgets</h1>
>     <div class="adr">
>       <span class="locality">Los Angeles</span>
>     </div>
>   </div>
> </div>
> <div class="hcard" onUFparseEvent="add_org_and_city()">
>   <div class="fn">Michael Leikam</fn>
>   <a class="email" href="mailto:[EMAIL PROTECTED]">
> </div>

The problem is, with this bit of code:

onUFparseEvent="add_org_and_city()"

you just added:

* an event model
* a functional programming model
* a parameter model (clearly those parentheses are there for a reason)

This very much falls in the category of using a steamroller to swat a fly. Not only is it overkill, but in practice, extremely cumbersome to do so.

> I don't really want to include the entire div#company since
> it includes fields that already exist in my personal block,
> e.g., "fn".

Actually, you can do that, since the parsing rule for singleton properties is simply to ignore later instances. Thus just make sure you put the property declarations in your personal block first, and those will be found first.
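To make the two parsing rules discussed so far concrete (an include replaces the current node with the referenced node, and a singleton property keeps only its first instance), here is a minimal Python sketch. It is an illustration only, not code from any microformats parser: the markup, the PROPERTIES tuple, and the function names resolve_includes, extract, and demo are all assumptions made up for this example, and every listed property is treated as first-instance-wins for simplicity.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup invented for this sketch.
MARKUP = """
<div>
  <div id="company" class="vcard">
    <span class="fn org">Example Widgets</span>
    <span class="locality">Los Angeles</span>
  </div>
  <div id="person" class="vcard">
    <span class="fn">Jane Example</span>
    <object class="include" data="#company"></object>
  </div>
</div>
"""

# Assumption: only these class names count as properties here,
# and each one is handled as a singleton.
PROPERTIES = ("fn", "org", "locality")


def classes(el):
    """Return the space-separated class names of an element."""
    return el.get("class", "").split()


def resolve_includes(root):
    """Replace each class="include" element with a copy of the node
    (including descendants) that its data="#id" attribute references.
    Includes nested inside included content are not followed."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    for parent in list(root.iter()):
        for i, child in enumerate(list(parent)):
            if "include" not in classes(child):
                continue
            target = by_id.get(child.get("data", "").lstrip("#"))
            if target is None:
                continue  # dangling reference: leave the node alone
            clone = copy.deepcopy(target)
            clone.tail = child.tail
            parent[i] = clone


def extract(card):
    """Walk the card in document order; the first instance of a
    property wins and later instances are ignored."""
    props = {}
    for el in card.iter():
        text = "".join(el.itertext()).strip()
        for name in classes(el):
            if name in PROPERTIES:
                props.setdefault(name, text)
    return props


def demo():
    root = ET.fromstring(MARKUP)
    resolve_includes(root)
    return extract(root.find(".//*[@id='person']"))


if __name__ == "__main__":
    print(demo())
    # {'fn': 'Jane Example', 'org': 'Example Widgets', 'locality': 'Los Angeles'}
```

Note that the whole company block is included, yet the personal "fn" survives because it appears first in document order. There are no events, callbacks, or parameters involved, only a fixed replacement rule applied during parsing.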
> Adding an ID to span.locality, which I think
> is how include-pattern wants to handle this, isn't
> appealing because I'd want to use a generic hcard generator
> for any contact information.

Why? Seems like you are trying to do some pretty fancy interconnected hCard stuff - to then add the unnecessary constraint of using "a generic hcard generator" makes no sense. It is trivial to add ID attributes as needed, even after the fact.

A better approach to take would be to point (URL) to a real world example you are trying to markup which actually *has* these issues, so we can all take a look and figure out what to do with it.

Abstract examples don't merit much discussion around here.

> I also see the benefit of advocating limited solutions for
> real problems. That's a very good goal.

It's not just a goal for microformats, it is a core principle. We've rejected far more features that were far less abstract.

We not only limit to solutions for real problems. We limit to the 80% case and punt on the 20% case (which makes people who want those 20% cases upset, but the alternative is to double or triple the amount of time to try to get things done).

> I really wasn't
> expecting the community here to say "oh, ok, we'll add DOM
> support tomorrow."

DOM Model definition is a reasonable request, and is essentially what the "parsing" documentation defines.

DOM API - don't expect to ever see anything in that regard. There should be no additional DOM API for microformats above and beyond the context where they are already used, and that is outside the definition of microformats and their processing model (e.g. microformats in HTML can be manipulated with HTML DOM APIs by HTML user agents that choose to support HTML DOM APIs).

> There's clearly more in terms of
> thought and use-cases that would need to go into deciding
> whether it's actually a good solution to real problems we
> face in marking up our content.
> But from the replies I've
> gotten, it sounds like this is the beginning of a
> discussion and not something that is already ongoing.

It is neither ongoing, nor IMHO the beginning of a discussion.

We're not adding anything procedural to microformats. They (procedural additions) are unnecessary, and introduce SO MANY problems in the processing model (e.g. viruses, security problems etc.) as to be not worth it.

For more on that, see Tim Berners-Lee's "What not to email" here:

http://www.w3.org/People/Berners-Lee/#Before

and note all the examples he lists. As I said, this is a well known problem in data format design.

Note that when Tim says you can send him HTML, he knows he can turn off javascript (or run a reader that doesn't bother to implement it at all), and he will still be able to view the content just fine (assuming it is properly authored with XHTML+CSS).

Thanks,

Tantek

[1] http://www.w3.org/TR/xml-infoset/

_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss
