Re: [whatwg] summary tag to help avoid redundancy of meta description tag
On Thu, 18 Mar 2010, Roger Hågensen wrote:
> On my own site currently I mostly replicate the first paragraph of an article in my journal as the meta description, and write one up for other pages, usually replicating some of the content. I'm both looking for and want a solution to avoid such redundancy.

The simplest solution is to just not include a description, and rely on tools to determine automatically what the most relevant information on the page is.

> The perfect solution would be a summary tag, if you look at the journal articles on my site you can imagine the first paragraph being done like this:
>
> <p><summary>This is just an example, it's a replacement for the old meta description, and is a brief summary (description) of the page (content)</summary></p>
>
> This way the first paragraph in a page would remain unchanged from how it is done today, and a search engine like Google or screen readers etc. would use the summary tag instead of the meta description (which is no longer needed at all in cases like this), if more than one summary tag the first is considered the page summary one, while the others are ignored (but still shown as content obviously).

That, or an attribute, would be a reasonable solution, but I'm not really convinced the problem is that important.

On Thu, 18 Mar 2010, Roger Hågensen wrote:
> Example using HTML5 microdata: (would this be appropriate, would browser devs, and Google and other search engines support this?)

You _could_ use microdata to do this, but I don't think it's really a great use of microdata. This kind of thing would be better done as a microformat, e.g. using a well-known class value.

On Fri, 19 Mar 2010, Ashley Sheridan wrote:
> Why not just use server-side code to output the first paragraph of content as the description for the page also?

That is indeed another possible solution to avoid hand-authoring duplicate content.
On Fri, 19 Mar 2010, Roger Hågensen wrote:
> http://lists.w3.org/Archives/Public/public-html/2009Aug/0990.html suggests <link rel="description" href="#desc" />, which is ok I guess. But why not simply allow this instead:
>
> <meta name="description" href="#desc" />
>
> Existing parsers would notice that content= is missing, which is stated as being required; parsers that have been updated would notice there is a href= instead, so search engines could just look for that id in the page. I think this would have the highest success rate. If backwards compatibility is such a major concern then this could be done:
>
> <meta name="description" content="" href="#desc" />
>
> I'm unsure what gives the best result for various parsers though. Would an empty content attribute make them behave the same as if the meta tag was not there at all? Or would it cause them to use the empty string as the actual page description? I'd prefer to have the content attribute missing instead myself, but...

link is the right element for links, meta for text data. Either way, though, the right way to address this is to convince implementors (such as a search engine developer) that they should follow these links and get the description from them. That is an early step in changing the spec:

http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_a_specification.3F

On Thu, 18 Mar 2010, Roger Hågensen wrote:
> [regarding data-*=] Maybe a better naming would have been: doc-* It's short, and it kinda reflects what it's related to as well, right? Or does that clash with something?

data-*= is probably too well established to change at this point unless there's a really compelling reason.

Cheers,
-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
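To make the proposal above concrete, here is a minimal sketch of what a consumer might do with <meta name="description" href="#desc" />: prefer a non-empty content= value, otherwise follow the fragment to the element with that id and strip the markup around its text. The href attribute on meta is only the proposal discussed in this thread, not part of any specification; the names DescriptionResolver and resolve_description are hypothetical, and the sketch assumes reasonably well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class DescriptionResolver(HTMLParser):
    """Hypothetical consumer of the proposed <meta name=description href=#id>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.content = None    # value of a conventional content= attribute
        self.target_id = None  # id referenced by the proposed href= attribute
        self.depth = 0         # > 0 while inside the referenced element
        self.chunks = []       # text collected from the referenced element
        self.done = False      # only the first element with that id is used

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "description":
            # An empty content="" is treated the same as a missing one.
            self.content = a.get("content") or self.content
            href = a.get("href") or ""
            if href.startswith("#"):
                self.target_id = href[1:]
            return
        if tag in VOID:
            return
        if self.depth:
            self.depth += 1
        elif not self.done and self.target_id and a.get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def resolve_description(page):
    """Return the description text, or None if the page offers neither form."""
    parser = DescriptionResolver()
    parser.feed(page)
    parser.close()
    if parser.content:
        return parser.content
    text = " ".join("".join(parser.chunks).split())
    return text or None
```

Note that the meta element has to appear before the referenced element for this single-pass sketch to work, which is the case when it sits in the head. A consumer that does not want to scan the body can simply stop after the head and treat the description as missing.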
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On 2010-03-19 17:19, Roger Hågensen wrote:
> On 2010-03-19 15:43, Ashley Sheridan wrote:
>> On Fri, 2010-03-19 at 15:43 +0100, Roger Hågensen wrote:
>>> On 2010-03-19 15:17, Ashley Sheridan wrote:
>>>> I just feel that the <head> and <body> areas of a page have two distinct uses, and unnecessary crossovers shouldn't occur if it's avoidable.
>>>
>>> If you look at my other thread, Re: [whatwg] <meta name=description href=#desc />, it allows notifying the parser that the content is in the page, and it is up to the parser's configuration whether to scan beyond the header in that case. Best of both worlds IMO.
>>>
>>> Roger.
>>
>> I did see that, and it looks like a great idea, as it shouldn't really break anything, and I saw that it should be possible to use for the keywords too, which would fit perfectly with tag cloud systems used on a page. I would presume that this would cause the content parser (browser) to strip any and all tags surrounding the marked content?
>>
>> Thanks, Ash
>> http://www.ashleysheridan.co.uk
>
> Well, looking at the example http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-March/025575.html I remembered that the title element may have HTML markup in it (seen it in the wild), so most parsers probably apply tag stripping to that already. So yeah, stripping tags the parser does not want shouldn't be an issue really.

Just made a feature request article at http://wiki.whatwg.org/wiki/Meta_element_href as it's just easier to reference that than a mailing list post. Sorry if it looks messy, I just used the advised template, but it's a start at least. If anyone feels like improving the language, feel free to go nuts.

Roger.
-- 
Roger Rescator Hågensen. Freelancer - http://EmSai.net/
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On 2010-03-18 10:04, Ashley Sheridan wrote:
> The main problem with that would be that parsers would then need to read into the body of the page to produce a description of your site. This might not produce much of an overhead on a one-off basis, but imagine a parser that is grabbing the description from hundreds or thousands of pages, then this could become a bit of a problem.

I do not see how that is any more or less of a problem than today with pages that have the meta description missing. What do those parsers do then? Do they stop at </head>? What do they use as the description instead? The first paragraph? The parsers used by all major search engines certainly do not halt, they break down the entire page, right?

As for delays, that is not an issue for consumers. I can not recall any browser ever showing me the meta description unless I explicitly view page properties.

I can imagine that the visually impaired community would love something like this, as it would basically tell screen readers that this is the first paragraph/summary/description/teaser of the page, allowing blind people to more rapidly jump from page to page.

Currently the meta description is not always good content. It would be interesting to see a Google analysis of how the meta description is used, i.e. how many are basically repeating page content (like I do), how many just dump keywords in there, and how many pages on a site have a site-wide identical description? And so on.

Roger.
-- 
Roger Rescator Hågensen. Freelancer - http://EmSai.net/
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On Fri, 2010-03-19 at 13:43 +0100, Roger Hågensen wrote:
> On 2010-03-18 10:04, Ashley Sheridan wrote:
>> The main problem with that would be that parsers would then need to read into the body of the page to produce a description of your site. This might not produce much of an overhead on a one-off basis, but imagine a parser that is grabbing the description from hundreds or thousands of pages, then this could become a bit of a problem.
>
> I do not see how that is any more or less of a problem than today with pages that have the meta description missing. What do those parsers do then? Do they stop at </head>? What do they use as the description instead? The first paragraph? The parsers used by all major search engines certainly do not halt, they break down the entire page, right?
>
> As for delays, that is not an issue for consumers. I can not recall any browser ever showing me the meta description unless I explicitly view page properties.
>
> I can imagine that the visually impaired community would love something like this, as it would basically tell screen readers that this is the first paragraph/summary/description/teaser of the page, allowing blind people to more rapidly jump from page to page.
>
> Currently the meta description is not always good content. It would be interesting to see a Google analysis of how the meta description is used, i.e. how many are basically repeating page content (like I do), how many just dump keywords in there, and how many pages on a site have a site-wide identical description? And so on.
>
> Roger.
> -- 
> Roger Rescator Hågensen. Freelancer - http://EmSai.net/

Search engines and people are not the only content parsers. Sure, you would expect a parser to maybe look further into the content if the description meta tag was missing, but imagine if a parser had to do this for all the content it looked at? There are still overheads to consider.

Why not just use server-side code to output the first paragraph of content as the description for the page also?

I just feel that the head and body areas of a page have two distinct uses, and unnecessary crossovers shouldn't occur if it's avoidable.

Thanks, Ash
http://www.ashleysheridan.co.uk
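The server-side approach suggested above could look roughly like the following sketch, which takes the article body, pulls the text of its first paragraph, and emits a meta description element for the page template to place in the head. This is only an illustration: the names FirstParagraph and meta_description are hypothetical, and the 160-character limit is an arbitrary assumption, not a figure from this thread or from any specification.

```python
from html import escape
from html.parser import HTMLParser


class FirstParagraph(HTMLParser):
    """Collect the text of the first <p> element, ignoring inline markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.inside = False  # True while inside the first <p>
        self.done = False    # True once the first <p> has ended
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            if self.inside:
                # A new <p> implicitly closes the one that is still open.
                self.inside, self.done = False, True
            elif not self.done:
                self.inside = True

    def handle_endtag(self, tag):
        if tag == "p" and self.inside:
            self.inside, self.done = False, True

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def meta_description(body_html, limit=160):
    """Build a meta description element from the first paragraph of body_html."""
    parser = FirstParagraph()
    parser.feed(body_html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())
    if len(text) > limit:
        # Cut at the last word boundary inside the (assumed) length limit.
        text = text[:limit].rsplit(" ", 1)[0] + "..."
    return '<meta name="description" content="%s">' % escape(text, quote=True)
```

This keeps the head self-contained for parsers that never read the body, at the cost of sending the same text twice, which is the redundancy the thread is arguing about.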
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On 2010-03-19 15:17, Ashley Sheridan wrote:
> Search engines and people are not the only content parsers. Sure, you would expect a parser to maybe look further into the content if the description meta tag was missing, but imagine if a parser had to do this for all the content it looked at? There are still overheads to consider.
>
> Why not just use server-side code to output the first paragraph of content as the description for the page also?
>
> I just feel that the head and body areas of a page have two distinct uses, and unnecessary crossovers shouldn't occur if it's avoidable.

True, but there is also such a thing as unneeded redundancy. Sure, repeating the same info in the meta tags that is also in the document may not add that many KB, but with an increasing number of page requests that really piles up the bandwidth total. Something users, hosters and ISPs should all have an interest in, right?

If you look at my other thread, Re: [whatwg] <meta name=description href=#desc />, it allows notifying the parser that the content is in the page, and it is up to the parser's configuration whether to scan beyond the header in that case. Best of both worlds IMO.

Roger.
-- 
Roger Rescator Hågensen. Freelancer - http://EmSai.net/
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On Fri, 2010-03-19 at 15:43 +0100, Roger Hågensen wrote:
> On 2010-03-19 15:17, Ashley Sheridan wrote:
>> Search engines and people are not the only content parsers. Sure, you would expect a parser to maybe look further into the content if the description meta tag was missing, but imagine if a parser had to do this for all the content it looked at? There are still overheads to consider.
>>
>> Why not just use server-side code to output the first paragraph of content as the description for the page also?
>>
>> I just feel that the head and body areas of a page have two distinct uses, and unnecessary crossovers shouldn't occur if it's avoidable.
>
> True, but there is also such a thing as unneeded redundancy. Sure, repeating the same info in the meta tags that is also in the document may not add that many KB, but with an increasing number of page requests that really piles up the bandwidth total. Something users, hosters and ISPs should all have an interest in, right?
>
> If you look at my other thread, Re: [whatwg] <meta name=description href=#desc />, it allows notifying the parser that the content is in the page, and it is up to the parser's configuration whether to scan beyond the header in that case. Best of both worlds IMO.
>
> Roger.

I did see that, and it looks like a great idea, as it shouldn't really break anything, and I saw that it should be possible to use for the keywords too, which would fit perfectly with tag cloud systems used on a page. I would presume that this would cause the content parser (browser) to strip any and all tags surrounding the marked content?

Thanks, Ash
http://www.ashleysheridan.co.uk
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On 2010-03-19 15:43, Ashley Sheridan wrote:
> On Fri, 2010-03-19 at 15:43 +0100, Roger Hågensen wrote:
>> On 2010-03-19 15:17, Ashley Sheridan wrote:
>>> Search engines and people are not the only content parsers. Sure, you would expect a parser to maybe look further into the content if the description meta tag was missing, but imagine if a parser had to do this for all the content it looked at? There are still overheads to consider.
>>>
>>> Why not just use server-side code to output the first paragraph of content as the description for the page also?
>>>
>>> I just feel that the <head> and <body> areas of a page have two distinct uses, and unnecessary crossovers shouldn't occur if it's avoidable.
>>
>> True, but there is also such a thing as unneeded redundancy. Sure, repeating the same info in the meta tags that is also in the document may not add that many KB, but with an increasing number of page requests that really piles up the bandwidth total. Something users, hosters and ISPs should all have an interest in, right?
>>
>> If you look at my other thread, Re: [whatwg] <meta name=description href=#desc />, it allows notifying the parser that the content is in the page, and it is up to the parser's configuration whether to scan beyond the header in that case. Best of both worlds IMO.
>>
>> Roger.
>
> I did see that, and it looks like a great idea, as it shouldn't really break anything, and I saw that it should be possible to use for the keywords too, which would fit perfectly with tag cloud systems used on a page. I would presume that this would cause the content parser (browser) to strip any and all tags surrounding the marked content?
>
> Thanks, Ash
> http://www.ashleysheridan.co.uk

Well, looking at the example http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-March/025575.html I remembered that the title element may have HTML markup in it (seen it in the wild), so most parsers probably apply tag stripping to that already. So yeah, stripping tags the parser does not want shouldn't be an issue really.

Roger.
-- 
Roger Rescator Hågensen. Freelancer - http://EmSai.net/
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On 2010-03-18 03:37, Roger Hågensen wrote:

I know, replying to myself is a big no-no... *cough*

> I searched the list, and looked at the HTML5 briefly and found nothing, nor can I ever recall such. So this is both a question and a proposal. On my own site currently I mostly replicate the first paragraph of an article in my journal as the meta description, and write one up for other pages, usually replicating some of the content. I'm both looking for and want a solution to avoid such redundancy.

I kept searching after posting that and looked more into HTML5 and microdata... Besides a small aneurysm while trying to understand the darn thing, I did find a possible solution, but is it valid?

Example using HTML5 microdata: (would this be appropriate, would browser devs, and Google and other search engines support this?)

The following...

    <!doctype html>
    <html lang="en">
    <head>
    <meta charset="utf-8" />
    <title>Microdata replacing metadata example.</title>
    </head>
    <body>
    <article>
    <header>Section header.</header>
    <p itemprop="#description">This is the first paragraph in the document or an aside or some other content perhaps.</p>
    <p>More content here.</p>
    <footer>Author: <a href="example.com/author/url/" itemprop="#author">Roger Hågensen</a> on <time datetime="2010-03-18T08:00:00" itemprop="#date">18th March 2010 at 8 o'clock.</time><br />
    <span itemprop="#copyright">© Roger Hågensen 2010</span><br />
    Keywords: <span itemprop="#keywords"><a href="http://example.com/tag/Example/">Example</a>, <a href="http://example.com/tag/Microdata/">Microdata</a>, <a href="http://example.com/tag/HTML5/">HTML5</a></span></footer>
    </article>
    </body>
    </html>

replaces this...

    <!doctype html>
    <html lang="en">
    <head>
    <meta charset="utf-8" />
    <meta name="description" content="This is the first paragraph in the document or an aside or some other content perhaps." />
    <meta name="author" content="Roger Hågensen" />
    <meta name="date" content="2010-03-18T08:00:00" />
    <meta name="copyright" content="© Roger Hågensen 2010" />
    <meta name="keywords" content="Example, Microdata, HTML5" />
    <title>Microdata replacing metadata example.</title>
    </head>
    <body>
    <article>
    <header>Section header.</header>
    <p>This is the first paragraph in the document or an aside or some other content perhaps.</p>
    <p>More content here.</p>
    <footer>Author: <a href="example.com/author/url/">Roger Hågensen</a> on <time datetime="2010-03-18T08:00:00">18th March 2010 at 8 o'clock.</time><br />
    <span>© Roger Hågensen 2010</span><br />
    Keywords: <span><a href="http://example.com/tag/Example/">Example</a>, <a href="http://example.com/tag/Microdata/">Microdata</a>, <a href="http://example.com/tag/HTML5/">HTML5</a></span></footer>
    </article>
    </body>
    </html>

itemprop="#description" would basically need to be reserved in some standards document; I just used the # arbitrarily to indicate "this document" in this example.

-- 
Roger Rescator Hågensen. Freelancer - http://EmSai.net/
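As a rough illustration of how a tool might read the itemprop="#..." annotations from the first example, the sketch below gathers the text content of every element carrying an itemprop attribute and drops the leading "#". It is not the microdata extraction algorithm from the HTML5 drafts: the "#" prefix is only the ad hoc convention from the message above, the names ItempropCollector and collect_itemprops are hypothetical, and for simplicity the sketch uses the visible text of a time element rather than its datetime attribute.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are kept off the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ItempropCollector(HTMLParser):
    """Gather text content per itemprop name (simplified, not real microdata)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.props = {}
        self.stack = []  # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        name = dict(attrs).get("itemprop")
        self.stack.append(name.lstrip("#") if name else None)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that names a property.
        for name in self.stack:
            if name:
                self.props[name] = self.props.get(name, "") + data


def collect_itemprops(page):
    """Return {property name: whitespace-normalized text} for the page."""
    collector = ItempropCollector()
    collector.feed(page)
    collector.close()
    return {k: " ".join(v.split()) for k, v in collector.props.items()}
```

Fed the first example document, this would yield entries for description, author, date, copyright and keywords, mirroring the five meta elements that the second document spells out by hand.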
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On Thu, 2010-03-18 at 03:37 +0100, Roger Hågensen wrote:
> I searched the list, and looked at the HTML5 briefly and found nothing, nor can I ever recall such. So this is both a question and a proposal.
>
> On my own site currently I mostly replicate the first paragraph of an article in my journal as the meta description, and write one up for other pages, usually replicating some of the content. I'm both looking for and want a solution to avoid such redundancy.
>
> The perfect solution would be a summary tag, if you look at the journal articles on my site you can imagine the first paragraph being done like this:
>
> <p><summary>This is just an example, it's a replacement for the old meta description, and is a brief summary (description) of the page (content)</summary></p>
>
> This way the first paragraph in a page would remain unchanged from how it is done today, and a search engine like Google or screen readers etc. would use the summary tag instead of the meta description (which is no longer needed at all in cases like this). If there is more than one summary tag, the first is considered the page summary, while the others are ignored (but still shown as content obviously).
>
> If a new tag is overkill for this, maybe doing it this way instead (using one of the new HTML5 tags):
>
> <p><header summary>This is just an example, it's a replacement for the old meta description, and is a brief summary (description) of the page (content)</header></p>
>
> I really do not care how this is implemented/specced just as long as it's possible to do.
>
> I began thinking of this recently when it annoyed me that I basically had to enter the same content twice, after looking at my site links in Google, and thought to myself... Why do I have to use a meta description to tell Google to show the content in the first paragraph as the default summary of the page link? Why can't I simply specify that the first paragraph is the page's meta description? Why am I forced to bloat the page unnecessarily like this?
>
> There is no reason why the meta description can not be the actual content, as in most cases I've seen the meta description is supposed to be fully human readable, unlike the meta keywords which no search engine bothers with at all any more. So if the meta description is supposed to be humanly readable and displayable as the page summary to humans in search results, why can't it also actually be in the page content?
>
> I can see at least two ways this will be used. The more elegant way I showed, where the first paragraph is a summary/the lead-in of the page (and also happens to be the teaser content in my RSS feed as well), or at the bottom of a page with possibly linked category tags or similar within it, again allowing dual purpose and reduced redundancy.
>
> To reiterate, the idea of the summary tag (or however it is implemented) should be to have a human readable summary (or teaser as may be) of a page, which is itself shown in the page, but also a replacement for search engines that use the old meta description, avoiding redundancy.
>
> The end result is (hopefully) less redundancy, and a higher quality summary (page description) shown in search engine results, and so on. It also allows people to quickly understand what a page is about by just reading the first paragraph (or be enticed to read more).
>
> Now if something like this already exists/is possible I stand corrected and ask, please tell me how to do that. If not then I'd love to see something like this standardized.
>
> BTW! The text in the first paragraph of this very email could for example be the summary/description of this email. So if it was HTML tagged in some way, a mail indexing or search engine could use that as the summary or description view shown to a human user scrolling through archived emails.
>
> Regards, Roger.

The main problem with that would be that parsers would then need to read into the body of the page to produce a description of your site. This might not produce much of an overhead on a one-off basis, but imagine a parser that is grabbing the description from hundreds or thousands of pages, then this could become a bit of a problem.

Thanks, Ash
http://www.ashleysheridan.co.uk
Re: [whatwg] summary tag to help avoid redundancy of meta description tag!?
On 18.03.2010 03:37, Roger Hågensen wrote:
> I searched the list, and looked at the HTML5 briefly and found nothing, nor can I ever recall such. So this is both a question and a proposal. On my own site currently I mostly replicate the first paragraph of an article in my journal as the meta description, and write one up for other pages, usually replicating some of the content. ...

See related W3C bug: http://www.w3.org/Bugs/Public/show_bug.cgi?id=7577.

Best regards, Julian
[whatwg] summary tag to help avoid redundancy of meta description tag!?
I searched the list, and looked at the HTML5 briefly and found nothing, nor can I ever recall such. So this is both a question and a proposal.

On my own site currently I mostly replicate the first paragraph of an article in my journal as the meta description, and write one up for other pages, usually replicating some of the content. I'm both looking for and want a solution to avoid such redundancy.

The perfect solution would be a summary tag, if you look at the journal articles on my site you can imagine the first paragraph being done like this:

<p><summary>This is just an example, it's a replacement for the old meta description, and is a brief summary (description) of the page (content)</summary></p>

This way the first paragraph in a page would remain unchanged from how it is done today, and a search engine like Google or screen readers etc. would use the summary tag instead of the meta description (which is no longer needed at all in cases like this). If there is more than one summary tag, the first is considered the page summary, while the others are ignored (but still shown as content obviously).

If a new tag is overkill for this, maybe doing it this way instead (using one of the new HTML5 tags):

<p><header summary>This is just an example, it's a replacement for the old meta description, and is a brief summary (description) of the page (content)</header></p>

I really do not care how this is implemented/specced just as long as it's possible to do.

I began thinking of this recently when it annoyed me that I basically had to enter the same content twice, after looking at my site links in Google, and thought to myself... Why do I have to use a meta description to tell Google to show the content in the first paragraph as the default summary of the page link? Why can't I simply specify that the first paragraph is the page's meta description? Why am I forced to bloat the page unnecessarily like this?

There is no reason why the meta description can not be the actual content, as in most cases I've seen the meta description is supposed to be fully human readable, unlike the meta keywords which no search engine bothers with at all any more. So if the meta description is supposed to be humanly readable and displayable as the page summary to humans in search results, why can't it also actually be in the page content?

I can see at least two ways this will be used. The more elegant way I showed, where the first paragraph is a summary/the lead-in of the page (and also happens to be the teaser content in my RSS feed as well), or at the bottom of a page with possibly linked category tags or similar within it, again allowing dual purpose and reduced redundancy.

To reiterate, the idea of the summary tag (or however it is implemented) should be to have a human readable summary (or teaser as may be) of a page, which is itself shown in the page, but also a replacement for search engines that use the old meta description, avoiding redundancy.

The end result is (hopefully) less redundancy, and a higher quality summary (page description) shown in search engine results, and so on. It also allows people to quickly understand what a page is about by just reading the first paragraph (or be enticed to read more).

Now if something like this already exists/is possible I stand corrected and ask, please tell me how to do that. If not then I'd love to see something like this standardized.

BTW! The text in the first paragraph of this very email could for example be the summary/description of this email. So if it was HTML tagged in some way, a mail indexing or search engine could use that as the summary or description view shown to a human user scrolling through archived emails.

Regards, Roger.
-- 
Roger Rescator Hågensen. Freelancer - http://EmSai.net/