Good discussion though ;) Helps to resurrect the XHTML vs HTML argument every year or so, and ol' Mark has battled in these trenches many a time and has a lot to offer on the pros and cons :)
(I think I still owe him a beer for one debate I lost in this arena)..

On 12/7/06, Tom Kerr <[EMAIL PROTECTED]> wrote:
>
> On Thu, Dec 07, 2006 at 01:59:33PM +1000, Scott Barnes wrote:
> > On 12/7/06, Tom Kerr <[EMAIL PROTECTED]> wrote:
> > > On Thu, Dec 07, 2006 at 11:25:38AM +1000, Scott Barnes wrote:
> > > > On 12/6/06, Ryan Sabir <[EMAIL PROTECTED]> wrote:
> > > > > How many of you are developing sites in XHTML these days? Is it
> > > > > worth the extra effort?
> > > >
> > > > SEO is supposedly the duck's nuts as to why. Yet, you'd have to be a
> > > > moron to expect Google to differentiate between XHTML vs HTML as in
> > > > the end, content is the one commodity Google and co want initially.
> > > >
> > > > I've read many a debate on it, but in the end the browsers are smart
> > > > enough and will continue to evolve to the fact that tag prediction and
> > > > differentiating between Style vs Semantically Correct tagging has
> > > > probably become a moot point these days and usually reserved for the
> > > > HTML purists out there.
> > >
> > > I've not yet read an informed point of view that argued that Google And
> > > Friends *bias* their scoring systems towards XHTML, or even valid HTML.
> > > If you've got a link, I'd appreciate the chuckle. I think there's
> > > little doubt though that they would like to extract all possible content
> > > from whatever document you publish and classify it as best they can.
> > > The argument tends to be more along the lines that an automatic process
> > > is *better able* to extract and classify content from valid, well-formed
> > > HTML that follows a known set of rules. XHTML is better yet again
> > > because of the increased signal-to-noise ratio. Semantically correct
> > > markup simply conveys more information about the document contents.
> >
> > Agree and Disagree and can't quite separate the two.. I'm lost for words
> > and I'm sure it's the first time too! ;)
> >
> > It comes back to the heart of it all: content is search engines'
> > lifeblood, without it, they die. Google would die a horrible death
> > tomorrow if the "browser" was shot in the back of the head, Apollo and
> > WPF were to rule the world and life as we know it would change.
> >
> > So they and other search engines really can't sit back and enforce the
> > ruling, it's not in their interest to do so and the only way they could
> > argue to bring balance back to the force is to adapt to another solution,
> > RSS/ATOM etc..
> >
> > As at least it's reasonable to say "thou must abide by the validation"
> > rules, which in summary could allow them to extract substance from noise?
> > Probably why blogs in the Google space can be an annoyance at times to the
> > ranking structure - which they appear to have overcome.
> >
> > XHTML vs HTML... it's a moot point and I realistically can't see any
> > company applying penalties either way for the choice as it would carve
> > out a large percentage of the commodity they love the most.
>
> This is probably the key point I was trying to make, and at great risk
> of belabouring it, I'm not arguing that any engine either *does*
> or *should* penalise one format over the other (nor penalise semantic vs.
> non-semantic markup, which is a point Mark made in his followup, and a
> distinction I obviously blurred a little too far in my post). What I am
> arguing is that more meta-information can be extracted from, at one
> extreme, semantic, valid XHTML than, at the other extreme, HTML-2-ish
> tag soup.
>
> What I do argue is that if information about the document structure is
> available in the semantic markup, and it can be used to increase the
> relevance of search results by the system, search engine providers can,
> should and probably do make use of that extra information. It's not an
> explicit penalty as such, but if you're supplying less information about
> your document, a natural consequence is that it's less searchable.
> Content is a search engine's lifeblood, and it behooves the provider to
> extract every last drop of information about the content that is
> available.
>
> -T

--
Regards,
Scott Barnes
http://www.mossyblog.com

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "cfaussie" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/cfaussie?hl=en
-~----------~----~----~----~------~----~------~--~---
