Hi everyone.

So, a few months ago I was working on the ongoing value-excerption-pattern specification. Then I moved to San Francisco and my work went a little stagnant, but I'm trying to pick it up again.

The value-excerption-pattern is an attempt to fully spec the class="value" behaviour from "tel" in hCard, which some parsers have supported globally for a while and which has proved somewhat useful. In addition to fully spec'ing the behaviour for parsing class="value" elements for visible data, I've been working on additional specification to handle inclusion of machine-centric data alongside human forms (http://microformats.org/wiki/machine-data).

It's this machine-centric portion that I'm trying to nail down at the moment, since it would provide an in-demand solution for various recurring complaints (abbr-pattern dependencies, for example).

Also, note that recent brainstorming on patterns derived from the semantics of the <object> element and value excerption has shown that current, in-use browsers (Microsoft Internet Explorer and Apple's Safari 2) do not handle object acceptably for inline content (http://microformats.org/wiki/value-excerption-pattern-brainstorming#object_param_handling). So we're definitely stuck with needing to spec this pattern using generic mark-up.

Since it's been a while, this mail serves to summarise the current state of this spec and proposed resolutions to open issues. PLEASE, if you have additional issues to raise, add them to the wiki page (http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements).

Couple of Examples:
----------------------------

<span class="dtstart"><span class="value" title="2008-08-27T23:25:00-0700"></span> 11:25pm, August 27th 2008</span>

<p class="tel">
  <span class="type"><span class="value" title="cell"></span> Mobile</span>
  <span class="value">415-123-4567</span>
</p>


Purpose
-----------

This pattern allows you to embed fixed format content — such as the telephone type enumeration and parser-required data formats — alongside the visible format of the publisher's choice.
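
As a rough sketch of what a parser might do with the two examples above: the Python below uses only the standard library. The helper names (has_class, excerpt_value) are mine for illustration and are not taken from any existing parser. It assumes the fragment is well-formed enough for an XML parser, and it simplifies by looking only at direct children of the property element.

```python
import xml.etree.ElementTree as ET


def has_class(el, name):
    """True if the element's class attribute contains the given class name."""
    return name in el.get("class", "").split()


def excerpt_value(prop):
    """Hypothetical helper: return the value a parser might use for a property.

    Simplification: only direct children with class="value" are considered.
    """
    values = [child for child in prop if has_class(child, "value")]
    if not values:
        # No value children: fall back to the property's whole text.
        return "".join(prop.itertext()).strip()
    first = values[0]
    is_empty = len(first) == 0 and not (first.text or "").strip()
    if is_empty and first.get("title") is not None:
        # Empty value element: the machine form lives in its title attribute.
        return first.get("title")
    # Otherwise concatenate the visible text of the value elements.
    return "".join("".join(v.itertext()) for v in values).strip()


dtstart = ET.fromstring(
    '<span class="dtstart"><span class="value" title="2008-08-27T23:25:00-0700">'
    "</span> 11:25pm, August 27th 2008</span>"
)
tel = ET.fromstring(
    '<p class="tel">'
    '<span class="type"><span class="value" title="cell"></span> Mobile</span>'
    '<span class="value">415-123-4567</span>'
    "</p>"
)

print(excerpt_value(dtstart))           # machine form of the date
print(excerpt_value(tel.find("span")))  # the "type" sub-property
print(excerpt_value(tel))               # the visible telephone number
```

Under those assumptions the three calls give 2008-08-27T23:25:00-0700, cell and 415-123-4567 respectively, so the publisher's visible text ("Mobile", "11:25pm, August 27th 2008") never reaches the parsed output when a machine form is present.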

Responses to Issues so Far
--------------------------------------

1. DRY Violation worse than current ABBR-pattern. DRY is a problem when data is repeated in a document and risks one copy of the data not being maintained in sync with another. Maintenance of the document results in broken data.

Resolution: To address this, the empty-span part of the value excerption pattern will specify that the empty-span MUST be the first, non-whitespace-text-node child of the property element. Thus, this will parse:

<span class="dtstart"><span class="value" title="2008-11-04"></span>4th November</span>

But this will fail:

<span class="dtstart">On 4th November 2008 Barack Obama was elected the first African American president of the United States of America. He was really pleased about it. <span class="value" title="2008-11-04"></span> </span>

The first pattern keeps the code distance small between the data form (class=value) and the property name (class=dtstart). It disallows the machine-data portion from being separated from the property.
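
To make the proposed rule concrete, here is a minimal sketch of the first-child check in Python (standard library only). The function name leading_machine_value is hypothetical, the input is assumed to be well-formed enough for an XML parser, and the failing example is shortened from the one above.

```python
import xml.etree.ElementTree as ET


def leading_machine_value(prop):
    """Hypothetical check: return the title of an empty class="value" element
    only if it is the first thing inside the property element, ignoring
    whitespace-only text. Return None otherwise."""
    if (prop.text or "").strip():
        return None  # visible text comes before any child element
    if len(prop) == 0:
        return None  # the property has no child elements at all
    first = prop[0]
    if "value" not in first.get("class", "").split():
        return None  # the first child is not a value element
    if len(first) > 0 or (first.text or "").strip():
        return None  # the value element is not empty
    return first.get("title")


ok = ET.fromstring(
    '<span class="dtstart"><span class="value" title="2008-11-04"></span>'
    "4th November</span>"
)
bad = ET.fromstring(
    '<span class="dtstart">On 4th November 2008 '
    '<span class="value" title="2008-11-04"></span> </span>'
)

print(leading_machine_value(ok))
print(leading_machine_value(bad))
```

The first call returns the title, 2008-11-04. The second returns None because text precedes the empty span, which is the case the rule is meant to reject.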

Furthermore, the spec should encourage conformance checking tools to attempt to verify the machine date form against the human form and warn the user if the data does not match.
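
A best-effort version of such a check might look like the Python sketch below. This is only an illustration: the function name check_date_sync and the short list of human date formats are my assumptions, and a real tool would need to recognise far more forms. It returns None instead of guessing when it cannot read the human form.

```python
import re
from datetime import date, datetime

# Assumed, deliberately small set of human date layouts (English month names).
HUMAN_FORMATS = ("%d %B %Y", "%B %d %Y", "%d %b %Y")


def check_date_sync(machine, human):
    """Return True if the dates match, False if they differ, and None if the
    human form is not understood (so the tool cannot tell either way)."""
    # Only the date part of the machine form is compared.
    machine_date = date.fromisoformat(machine[:10])
    # Drop ordinal suffixes ("4th" -> "4"), commas, and extra whitespace.
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", human)
    cleaned = " ".join(cleaned.replace(",", " ").split())
    for fmt in HUMAN_FORMATS:
        try:
            parsed = datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # this layout did not fit, try the next one
        return parsed == machine_date
    return None


print(check_date_sync("2008-11-04", "4th November 2008"))
print(check_date_sync("2008-11-04", "5th November 2008"))
print(check_date_sync("2008-11-04", "last Tuesday"))
```

A checker built this way would warn on False and stay silent, or at most give a low-priority notice, on None, since an unreadable human form is not evidence of a mismatch.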

2. Violating the principle of visible data

Resolution: Microformats maintain a principle of marking up visible data. However, we have exceptional circumstances where the data required for parsing is not the data that publishers wish to display. Whilst parsers are a lower priority than publishers, the cost and complexity of parsing unstructured dates, or translated terms, is accepted as too high. Therefore it is necessary to violate DRY to include explicit representations for machines.

Currently authors may use CSS to hide the machine-form of dates. Microformats exist only in the HTML layer, and must not depend on CSS to meet publisher requirements.

The specification may also restrict this part of the pattern to certain properties where a machine-data form is required, as a means to discourage abuse.

3. Broken parsers drop empty elements

There are some broken but widespread HTML parsers which discard empty elements, resulting in the empty-span-value element being removed from documents (e.g. HTML Tidy). HTML Tidy is easily patched not to do this, but unpatched versions may already be deployed in publishing platforms.

Resolution: Without numbers, we don't know how many publishing systems would be affected by this. It's a problem for which the only resolution is to use a completely different pattern. As such, this proposal must put legacy broken parsers down as an accepted loss. CMSs locked to old versions of HTML Tidy would not be able to use this pattern without modification.

So, there aren't many issues against this part of the pattern, and the rules for it are coming together. There are likely to be differences of taste over how to achieve this function. This is my favoured version, but a lot of the issues resolved here would apply equally to other patterns too, so I'd appreciate further input to see if this pattern can be thoroughly specified.

Please, if you have problems to raise with this proposal, add them to the -issues page on the wiki at:

http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements

Thank you,

Ben
_______________________________________________
microformats-discuss mailing list
[email protected]
http://microformats.org/mailman/listinfo/microformats-discuss
