On Thu, Dec 07, 2006 at 01:59:33PM +1000, Scott Barnes wrote:
> On 12/7/06, Tom Kerr <[EMAIL PROTECTED]> wrote:
> > On Thu, Dec 07, 2006 at 11:25:38AM +1000, Scott Barnes wrote:
> > > On 12/6/06, Ryan Sabir <[EMAIL PROTECTED]> wrote:
> > > > How many of you are developing sites in XHTML these days? Is it
> > > > worth the extra effort?
> > >
> > > SEO is supposedly the duck's nuts as to why. Yet, you'd have to be a
> > > moron to expect Google to differentiate between XHTML vs HTML as in
> > > the end, content is the one commodity Google and co want initially.
> > >
> > > I've read many a debate on it, but in the end the browsers are smart
> > > enough and will continue to evolve to the fact that tag prediction and
> > > differentiating between Style vs Semantically Correct tagging has
> > > probably become a moot point these days and usually reserved for the
> > > HTML purists out there.
> >
> > I've not yet read an informed point of view that argued that Google And
> > Friends *bias* their scoring systems towards XHTML, or even valid HTML.
> > If you've got a link, I'd appreciate the chuckle. I think there's
> > little doubt though that they would like to extract all possible content
> > from whatever document you publish and classify it as best they can.
> > The argument tends to be more along the lines that an automatic process
> > is *better able* to extract and classify content from valid, well-formed
> > HTML that follows a known set of rules. XHTML is better yet again
> > because of the increased signal-to-noise ratio. Semantically correct
> > markup simply conveys more information about the document contents.
>
> Agree and Disagree and can't quite separate the two.. I'm lost for words and
> I'm sure it's the first time too! ;)
>
> It comes back to the heart of it all, content is Search Engines' lifeblood,
> without it, they die. Google would die a horrible death tomorrow if the
> "browser" was shot in the back of the head, Apollo and WPF were to rule the
> world and life as we know it would change.
>
> So they and other search engines really can't sit back and enforce the
> ruling, it's not in their interest to do so and the only way they could argue
> to bring balance back to the force is to adapt to another solution, RSS/ATOM
> etc..
>
> As at least it's reasonable to say "thou must abide by the validation" rules,
> which in a summary could allow them to extract substance from noise? Probably
> why blogs in the Google space can be an annoyance at times to the ranking
> structure - which they appear to have overcome.
>
> XHTML vs HTML... it's a moot point and I realistically can't see any company
> applying penalties either way for the choice as it would carve out a large
> percentage of the commodity they love the most.

This is probably the key point I was trying to make, and at great risk
of belabouring it: I'm not arguing that any engine either *does* or
*should* penalise one format over the other (nor penalise semantic vs.
non-semantic markup, which is a point Mark made in his followup, and a
distinction I obviously blurred a little too far in my post).

What I am arguing is that more meta-information can be extracted from,
at one extreme, semantic, valid XHTML than from, at the other extreme,
HTML-2-ish tag soup. If information about the document structure is
available in the semantic markup, and the system can use it to increase
the relevance of search results, search engine providers can, should and
probably do make use of that extra information.

It's not an explicit penalty as such, but if you're supplying less
information about your document, a natural consequence is that it's less
searchable. Content is a search engine's lifeblood, and it behooves the
provider to extract every last drop of information about the content
that is available.

-T
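
To make the "more meta-information" argument concrete, here is a minimal sketch in Python using the standard library's html.parser. It is only an illustration and is not how any real search engine works: the OutlineExtractor class, the extract_outline helper, the choice of SEMANTIC_TAGS and both sample documents are hypothetical. The two samples carry the same visible text, but only the semantic one tells an automatic process which parts are the title, the headings and the emphasised terms.

```python
from html.parser import HTMLParser


class OutlineExtractor(HTMLParser):
    """Collect (tag, text) pairs for elements that carry structural meaning."""

    # Hypothetical choice of "meaningful" tags, for illustration only.
    SEMANTIC_TAGS = {"title", "h1", "h2", "h3", "em", "strong", "cite"}

    def __init__(self):
        super().__init__()
        self.outline = []  # finished (tag, text) pairs, in closing order
        self._open = []    # stack of (tag, text chunks) for open semantic tags

    def handle_starttag(self, tag, attrs):
        if tag in self.SEMANTIC_TAGS:
            self._open.append((tag, []))

    def handle_data(self, data):
        # Text belongs to every semantic element that is currently open.
        for _, chunks in self._open:
            chunks.append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            name, chunks = self._open.pop()
            text = " ".join("".join(chunks).split())
            if text:
                self.outline.append((name, text))


def extract_outline(markup):
    """Return the structural information recoverable from the markup."""
    parser = OutlineExtractor()
    parser.feed(markup)
    parser.close()
    return parser.outline


SEMANTIC = """
<html><head><title>XHTML vs HTML</title></head>
<body>
<h1>Is XHTML worth the effort?</h1>
<p>Content is a search engine's <em>lifeblood</em>.</p>
<h2>Validation</h2>
<p>Well-formed markup follows a known set of rules.</p>
</body></html>
"""

TAG_SOUP = """
<font size=6><b>Is XHTML worth the effort?</b></font><br>
Content is a search engine's <i>lifeblood</i>.<br><br>
<font size=4><b>Validation</b></font><br>
Well-formed markup follows a known set of rules.
"""

if __name__ == "__main__":
    print(extract_outline(SEMANTIC))
    print(extract_outline(TAG_SOUP))
```

Under these assumptions the semantic document yields a title, two headings and an emphasised term, while the tag soup yields nothing structural at all. Nothing has been penalised; there is simply less to extract.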
