On 21/05/2011 8:12 AM, Andy Lester wrote:
>> My reason for raising this now is that the HTML5 spec elements are starting
>> to appear on documents on the internet, and with these tags missing from
>> HTML::Tagset, it impacts the effectiveness of other libraries that depend
>> upon it.
> Haven't looked at the patch, but I think we need to talk about high-level
> thoughts about handling HTML5.
> * Should we have separate sets of tags for HTML 4 and 5?

If the split is HTML 4 (and below) versus HTML 5, then many separate
modules that rely on HTML::Tagset covering all of HTML would potentially
need to be reworked to cater for it.

> * If so, should it be handled be one module, or should it get split into to?

... or potentially one per major version of HTML (2/3/4/5)?

Potentially HTML-Tagset could be "all HTML tags", and it could itself be
composed of sub-modules "HTML::Tagset::HTML5", "HTML::Tagset::HTML4",
"HTML::Tagset::HTML3", etc.? But that could be a longer-term goal to
re-architect, perhaps?

What advantage does this give - validating HTML docs against a specific
release? I wonder if the various version-libraries could be generated
from the formal SGML DTD? And is it worth the effort?

> * Should we have strict and loose definitions of tags, going forward? One of
> the things I've run into is tag attributes that are recognized by browsers,
> but not in the spec.

I suspect that other modules (HTML::TreeBuilder et al.) rely on loose
definitions in order to deal with real-world, badly coded documents
that don't meet the spec. I'd hate to break them and cause regressions.

  James

-- 
/Mobile:/ +61 422 166 708, /Email:/ james_AT_rcpt.to
PLUG President 2011: http://www.plug.org.au
Perth.pm Organiser 2011: http://perth.pm.org