Re: [uf-discuss] a question about concatenation and hAtom entry content

Ben Wiley Sittler Sat, 02 Jun 2007 06:31:25 -0700

On 6/1/07, David Janes <[EMAIL PROTECTED]> wrote:

On 6/1/07, Ryan King <[EMAIL PROTECTED]> wrote:
> On May 31, 2007, at 11:29 AM, David Janes wrote:
>
> > On 5/31/07, Ryan King <[EMAIL PROTECTED]> wrote:
> >
> >> Another option is that entry content is:
> >>
> >> <p class="entry-content">Content</p>
> >> <p class="entry-content">More Content</p>
> >>
> >>
> >> Is there a reason why hAtom as currently spec'ed only does text, not
> >> markup?
> >
> > I thought it did markup! I totally see what you are saying here
> > though; the question here is whether we include the DOM nodes that
> > specify entry-content. This isn't in the spec, and you wouldn't want
> > to do it everywhere (entry-title, for example) but it would make sense
> > if it did.
>
> You're right, I'm suggesting that only for entry-content (and maybe
> entry-summary) that we take the nodes that have the class name on
> them. The reason? I've seen this several times:
>
> <... class="hentry">
>   ...
>   <p class="entry-content">...</p>
>
>   <p class="entry-content">...</p>
>
> </>
>
> It makes sense, to me, to put the paragraph nodes, intact, in the
> content.


I concur. Time to start ramping up for hAtom 0.2, if I can get some
blocks of free time.

Regards, etc...


why not do this for the entry title, too? according to the atom spec,
this can contain markup too (and in my experience, often does.)

and yes, having some well-defined rules for xhtml → text flattening
would be good (not just for microformats, but for xhtml apps
generally.) here are the ones i use:

1. ignore content of the following elements: script, style, textarea, title

2. use the alt text as the text for img elements

3. normalize all runs of one or more whitespace characters to a single
space in all elements that do not have a pre, xmp, plaintext, or
listing ancestor

4. insert breaks before and after the following elements: br, p, div,
hr, h1, h2, h3, h4, h5, blockquote, address, table, tr, td, form, pre,
xmp, listing, ol, ul, menu, dir, li, dl, dt and dd
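
a rough sketch of how the four rules above could be wired together,
using python's standard html.parser. this is only an illustration under
stated assumptions, not the code actually in use: the names
TextFlattener and flatten are made up for this example, a "break" is
taken to mean a single newline, adjacent breaks are merged, and no
break is emitted at the very start of the output.

```python
import re
from html.parser import HTMLParser

IGNORED = {"script", "style", "textarea", "title"}
PRESERVED = {"pre", "xmp", "plaintext", "listing"}
BREAKING = {"br", "p", "div", "hr", "h1", "h2", "h3", "h4", "h5",
            "blockquote", "address", "table", "tr", "td", "form", "pre",
            "xmp", "listing", "ol", "ul", "menu", "dir", "li", "dl",
            "dt", "dd"}
BREAK = object()  # sentinel marking a break; rendered as "\n" (assumption)


class TextFlattener(HTMLParser):
    """Hypothetical flattener: collects text pieces and break markers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # text pieces and BREAK markers
        self.ignored = 0     # nesting depth inside ignored elements
        self.preserved = 0   # nesting depth inside pre-like elements

    def _add_break(self):
        # Assumption: merge adjacent breaks, and emit none at the start.
        if self.out and self.out[-1] is not BREAK:
            self.out.append(BREAK)

    def _add_text(self, text):
        if not self.preserved:
            # Collapse whitespace runs, including across inline elements.
            text = re.sub(r"\s+", " ", text)
            prev = self.out[-1] if self.out else BREAK
            if prev is BREAK or prev.endswith(" "):
                text = text.lstrip(" ")
        if text:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in IGNORED:
            self.ignored += 1
        if self.ignored:
            return
        if tag in BREAKING:
            self._add_break()
        if tag in PRESERVED:
            self.preserved += 1
        if tag == "img":
            # Use the alt text as the text of the image.
            self._add_text(dict(attrs).get("alt") or "")

    def handle_endtag(self, tag):
        if tag in IGNORED:
            self.ignored = max(0, self.ignored - 1)
            return
        if self.ignored:
            return
        if tag in PRESERVED:
            self.preserved = max(0, self.preserved - 1)
        if tag in BREAKING:
            self._add_break()

    def handle_data(self, data):
        if not self.ignored:
            self._add_text(data)


def flatten(markup):
    """Flatten an (X)HTML fragment to plain text."""
    parser = TextFlattener()
    parser.feed(markup)
    parser.close()
    return "".join("\n" if part is BREAK else part for part in parser.out)
```

calling flatten() on a fragment with two entry-content paragraphs
yields one line of text per paragraph. html.parser is lenient, so this
also accepts markup that is not well-formed xhtml; a stricter xml
parser would be an equally reasonable choice.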

still to do:

5. table layout algorithm

6. conversion of content inside sup or sub to the corresponding unicode
characters where possible, but only when the entire non-whitespace sub
or sup content can be converted. this would include e.g. <sup>TM</sup>
→ ™ and <sup>2</sup> → ²
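
the sup/sub conversion is not implemented yet, but the "all or
nothing" part could look roughly like the sketch below. the function
name convert_script and the mapping tables are assumptions made for
this example; the tables cover only digits, a few signs, and the
letters n and i, plus TM as a whole-string special case, and
whitespace inside the element is simply dropped.

```python
# Per-character mappings (deliberately incomplete; an assumption).
SUP = dict(zip("0123456789+-=()ni", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ⁿⁱ"))
SUB = dict(zip("0123456789+-=()", "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎"))
# Whole-string mappings for content that maps to a single character.
WHOLE_SUP = {"TM": "™"}


def convert_script(content, kind):
    """Convert the text of a sup or sub element to Unicode.

    kind is "sup" or "sub". Returns None unless every non-whitespace
    character can be converted, so the caller can fall back to the
    unconverted text.
    """
    text = "".join(content.split())  # drop all whitespace (assumption)
    if kind == "sup" and text in WHOLE_SUP:
        return WHOLE_SUP[text]
    table = SUP if kind == "sup" else SUB
    if text and all(ch in table for ch in text):
        return "".join(table[ch] for ch in text)
    return None
```

returning None instead of a partly converted string keeps something
like <sup>2x</sup> from turning into a mix of converted and
unconverted characters.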

-ben

_______________________________________________
microformats-discuss mailing list
microformats-discuss@microformats.org
http://microformats.org/mailman/listinfo/microformats-discuss
