[uf-discuss] Appeal for Issues: Empty spans in value-excerption-pattern

Ben Ward Thu, 06 Nov 2008 02:01:52 -0800

Hi everyone.

So, a few months ago I was working on the ongoing value-excerption-pattern specification. Then I moved to San Francisco and my work went a little stagnant, but I'm trying to pick it up again.

The value-excerption-pattern is an attempt to fully spec the class="value" behaviour from "tel" in hCard, which has been supported globally in some parsers for a while and has proved somewhat useful. In addition to fully spec'ing the behaviour for parsing class="value" elements for visible data, I've been working on additional specification to handle inclusion of machine-centric data alongside human forms (http://microformats.org/wiki/machine-data).

It's this machine-centric portion that I'm trying to nail down at the moment, since it would provide an in-demand solution for various recurring complaints (abbr-pattern dependencies, for example).

Also, note that recent brainstorming regarding patterns derived from the semantics of the <object> element and value excerption has shown that current, in-use browsers (Microsoft Internet Explorer and Apple's Safari 2) do not handle object acceptably for inline content (http://microformats.org/wiki/value-excerption-pattern-brainstorming#object_param_handling). So we're definitely stuck with needing to spec this pattern using generic mark-up.

Since it's been a while, this mail serves to summarise the current state of this spec and proposed resolutions to open issues. PLEASE, if you have additional issues to raise, add them to the wiki page (http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements).


Couple of Examples:
----------------------------

 11:25pm, August 27th 2008


 <p class="tel">
   Mobile
   <span class="value">415-123-4567</span>
 </p>
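
A minimal sketch of how a parser might excerpt the visible value from a property like the "tel" example above. It assumes the fragment is well-formed XML, and that text from all class="value" descendants is concatenated without a separator; the function names (has_class, excerpt_value) are hypothetical and not taken from any existing parser.

```python
import xml.etree.ElementTree as ET


def has_class(el, name):
    """True if `name` is one of the element's space-separated class names."""
    return name in (el.get("class") or "").split()


def excerpt_value(fragment, prop_class):
    """Return the excerpted value of the first element carrying `prop_class`.

    Assumption: if the property has class="value" descendants, only their
    text is used (joined with no separator); otherwise the whole text
    content of the property is used.
    """
    root = ET.fromstring(fragment)
    prop = next((el for el in root.iter() if has_class(el, prop_class)), None)
    if prop is None:
        return None
    values = [el for el in prop.iter()
              if el is not prop and has_class(el, "value")]
    if values:
        return "".join("".join(v.itertext()).strip() for v in values)
    return "".join(prop.itertext()).strip()


TEL = '<p class="tel">Mobile <span class="value">415-123-4567</span></p>'
print(excerpt_value(TEL, "tel"))
```

With this approach the word "Mobile" stays visible to readers but is left out of the parsed telephone number.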


Purpose
-----------

This pattern allows you to embed fixed-format content (such as the telephone type enumeration and parser-required data formats) alongside the visible format of the publisher's choice.


Responses to Issues so Far
--------------------------------------

1. DRY Violation worse than current ABBR-pattern. DRY is a problem when data is repeated in a document and risks one copy of the data not being maintained in sync with another. Maintenance of the document then results in broken data.

Resolution: To address this, the empty-span part of the value excerption pattern will specify that the empty-span MUST be the first child of the property element, not counting whitespace-only text nodes. Thus, this will parse:

4th November


But this will fail:

On 4th November 2008 Barack Obama was elected the first African American president of the United States of America. He was really pleased about it.

The first pattern keeps the code distance small between the data form (class=value) and the property name (class=dtstart). It disallows the machine-data portion from being separated from the property.
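
The first-child rule can be sketched roughly as follows. This is only an illustration under stated assumptions: the machine data is carried in the title attribute of the empty span (as the wiki anchor "Parsing title from Empty value Elements" suggests), the input is well-formed XML, and a span holding only whitespace counts as empty. The markup in both samples and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def has_class(el, name):
    """True if `name` is one of the element's space-separated class names."""
    return name in (el.get("class") or "").split()


def machine_value(fragment, prop_class):
    """Return the title of a leading empty class="value" span, else None."""
    root = ET.fromstring(fragment)
    prop = next((el for el in root.iter() if has_class(el, prop_class)), None)
    if prop is None or len(prop) == 0:
        return None
    # Non-whitespace text before the first child element means the
    # empty span is not the first child, so the rule rejects it.
    if (prop.text or "").strip():
        return None
    first = prop[0]
    is_value_span = first.tag == "span" and has_class(first, "value")
    # Assumption: whitespace-only content still counts as "empty".
    is_empty = len(first) == 0 and not (first.text or "").strip()
    if is_value_span and is_empty:
        return first.get("title")
    return None


# Hypothetical markup: empty span first, so the machine value is accepted.
PARSES = ('<span class="dtstart">'
          '<span class="value" title="2008-11-04"></span>4th November</span>')

# Hypothetical markup: prose comes first, so the machine value is rejected.
FAILS = ('<p class="dtstart">On 4th November 2008 Barack Obama was elected. '
         '<span class="value" title="2008-11-04"></span></p>')

print(machine_value(PARSES, "dtstart"))
print(machine_value(FAILS, "dtstart"))
```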

Furthermore, the spec should encourage conformance checking tools to attempt to verify the machine date form against the human form and warn the user if the data does not match.
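
As a rough idea of what such a conformance check could look like, the sketch below compares an ISO 8601 machine date with an English "day month [year]" human form. The function name, the supported date layout, and the three-way result (True, False, or None for "could not read the human form") are all assumptions made for illustration, not part of the proposal.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Matches e.g. "4th November" or "4 November 2008" (day first, English only).
HUMAN_DATE = re.compile(r"(\d{1,2})(?:st|nd|rd|th)?\s+([A-Za-z]+)(?:\s+(\d{4}))?")


def human_date_matches(machine_iso, human_text):
    """Compare a machine date with a human-readable one.

    Returns True or False when the human text could be read, and None
    when it could not, so a tool can warn only on a definite mismatch.
    """
    machine = date.fromisoformat(machine_iso)
    match = HUMAN_DATE.search(human_text)
    if match is None:
        return None
    month = MONTHS.get(match.group(2).lower())
    if month is None:
        return None
    if (int(match.group(1)), month) != (machine.day, machine.month):
        return False
    year = match.group(3)
    # A missing year in the human form is not treated as a mismatch.
    return year is None or int(year) == machine.year


print(human_date_matches("2008-11-04", "4th November"))
print(human_date_matches("2008-11-04", "5th November 2008"))
```

A real checker would need to cope with other orderings, languages and time values; returning None for unreadable text keeps the tool from raising false warnings.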


2. Violating the principle of visible data

Resolution: Microformats maintain a principle of marking up visible data. However, we have exceptional circumstances where the data required for parsing is not the data that publishers wish to display. Whilst parsers are a lower priority than publishers, the cost and complexity of parsing unstructured dates, or translated terms, is accepted as too high. Therefore it is necessary to violate DRY to include explicit representations for machines.

Currently authors may use CSS to hide the machine-form of dates. Microformats exist only in the HTML layer, and must not depend on CSS to meet publisher requirements.

The specification may also restrict this part of the pattern to certain properties where a machine-data form is required, as a means to discourage abuse.


3. Broken parsers drop empty elements

There are some broken but widespread HTML parsers which discard empty elements, resulting in the empty-span-value element being removed from documents (e.g. HTML Tidy). HTML Tidy is easily patched not to do this, but unpatched versions may already exist in publishing platforms.

Resolution: Without numbers, we don't know how many publishing systems would be affected by this. It's a problem for which the only resolution is to use a completely different pattern. As such, this proposal must put legacy broken parsers down as an accepted loss. CMSs locked to old versions of HTML Tidy would not be able to use this pattern without modification.

So, there aren't many issues against this part of the pattern, and the rules for it are coming together. There's likely some feeling about matters of taste as to how to achieve this function. This is my favoured version, but a lot of the issues resolved here would apply equally to other patterns too, so I'd appreciate further input to see if this pattern can be thoroughly specified.

Please, if you have problems to raise with this proposal, add them to the -issues page on the wiki at:


http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements

Thank you,

Ben