Good discussion though ;)

Helps to resurrect the XHTML vs HTML argument every year or so, and ol' Mark
has battled in these trenches many a time and has a lot to offer on the
pros and cons :)

(I think I still owe him a beer for one debate I lost in this arena.)

On 12/7/06, Tom Kerr <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, Dec 07, 2006 at 01:59:33PM +1000, Scott Barnes wrote:
> > On 12/7/06, Tom Kerr <[EMAIL PROTECTED]> wrote:
> > > On Thu, Dec 07, 2006 at 11:25:38AM +1000, Scott Barnes wrote:
> > > > On 12/6/06, Ryan Sabir <[EMAIL PROTECTED]> wrote:
> > > > > How many of you are developing sites in XHTML these days? Is it
> > > > > worth the extra effort?
> > > >
> > > > SEO is supposedly the duck's nuts as to why. Yet you'd have to be a
> > > > moron to expect Google to differentiate between XHTML and HTML, as in
> > > > the end content is the one commodity Google and co want initially.
> > > >
> > > > I've read many a debate on it, but in the end the browsers are smart
> > > > enough, and will continue to evolve, to the point that tag prediction
> > > > and differentiating between style vs semantically correct tagging has
> > > > probably become a moot point these days, usually reserved for the
> > > > HTML purists out there.
> > >
> > > I've not yet read an informed point of view that argued that Google
> > > And Friends *bias* their scoring systems towards XHTML, or even valid
> > > HTML.  If you've got a link, I'd appreciate the chuckle.  I think
> > > there's little doubt though that they would like to extract all
> > > possible content from whatever document you publish and classify it
> > > as best they can.  The argument tends to be more along the lines that
> > > an automatic process is *better able* to extract and classify content
> > > from valid, well-formed HTML that follows a known set of rules.  XHTML
> > > is better yet again because of the increased signal-to-noise ratio.
> > > Semantically correct markup simply conveys more information about the
> > > document contents.
> >
> > Agree and disagree, and can't quite separate the two.. I'm lost for
> > words, and I'm sure it's the first time too! ;)
> >
> > It comes back to the heart of it all: content is search engines'
> > lifeblood; without it, they die. Google would die a horrible death
> > tomorrow if the "browser" was shot in the back of the head, Apollo and
> > WPF were to rule the world, and life as we know it would change.
> >
> > So they and other search engines really can't sit back and enforce the
> > ruling; it's not in their interest to do so, and the only way they could
> > argue to bring balance back to the force is to adapt to another
> > solution, RSS/Atom etc..
> >
> > As at least it's reasonable to say "thou must abide by the validation
> > rules", which in summary could allow them to extract substance from
> > noise? Probably why blogs in the Google space can be an annoyance at
> > times to the ranking structure - which they appear to have overcome.
> >
> > XHTML vs HTML... it's a moot point, and I realistically can't see any
> > company applying penalties either way for the choice, as it would carve
> > out a large percentage of the commodity they love the most.
>
> This is probably the key point I was trying to make, and at great risk
> of belabouring it, I'm not arguing that any engine either *does*
> or *should* penalise one format over the other (nor penalise semantic vs.
> non-semantic markup, which is a point Mark made in his followup, and a
> distinction I obviously blurred a little too far in my post).  What I am
> arguing is that more meta-information can be extracted from, at one
> extreme, semantic, valid XHTML than, at the other extreme, HTML-2-ish
> tag soup.
>
> What I do argue is that if information about the document structure is
> available in the semantic markup, and it can be used to increase the
> relevance of search results by the system, search engine providers can,
> should and probably do make use of that extra information.  It's not an
> explicit penalty as such, but if you're supplying less information about
> your document, a natural consequence is that it's less searchable.
> Content is a search engine's lifeblood, and it behooves the provider to
> extract every last drop of information about the content that is
> available.
>
> -T
>
> >
>


-- 
Regards,
Scott Barnes
http://www.mossyblog.com


--~--~---------~--~----~------------~-------~--~----~
 You received this message because you are subscribed to the Google Groups 
"cfaussie" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/cfaussie?hl=en
-~----------~----~----~----~------~----~------~--~---
