I understand your point, but I disagree: the XML is of much more value than
the HTML, even though it embeds presentation.

Our Content Management system originally had an editor called the XML Editor,
which could capture more granular detail in the data.  You can still see a
screen shot of it in the tour:
https://pubc.metaverse.cc/app/PubcTour.aspx?index=10
You and I both know that an editor like this, or similar ones I've seen,
produces much more useful XML data, but the simple fact is that people don't
like using those editors.  Even though we had a large investment in the XML
Editor, we ultimately discontinued it in favor of Microsoft Word integration.
The business owners who actually use Content Management systems often aren't
technical enough to find their way around in Word, much less a more
complicated editor with XML tags.

For enterprise implementations, clients have 10,000 - 30,000 documents and
sometimes more, many of them in Microsoft Word format or a format that can
easily be converted to RTF.  There is no easy way to convert all of that data
to the kind of granular XML that is more descriptive and better to query by.
There's just too much data to convert intelligently into XML with any
reasonable, or even generous, budget.  That's a project in and of itself.

The XML generated from our XFormWebService is much more useful than HTML in
that it can be transformed with XSLT into any other format.  The
presentation that is embedded is only a "recommendation" and does not
mandate that the end transformation honor that presentation.  The example I
showed here used an HTML style-sheet, but I could have applied a wireless
style-sheet or any other format.
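
To make the "recommendation" point concrete, here is a rough Python sketch.
It is not our XFormWebService code and it uses Python's ElementTree as a
stand-in for real XSLT style-sheets; the names to_html, to_text and
INLINE_HTML are made up for illustration.  The input is a trimmed copy of the
sample XML quoted below.  One rendering honors the embedded presentation, the
other ignores it:

```python
import xml.etree.ElementTree as ET
from html import escape

# Trimmed copy of the sample XML quoted in this thread.
DOC = """<Doc>
 <Title><Bold><Font RelSize="+2">My Heading</Font></Bold></Title>
 <Para LineBreak="no" Align="left">This is <Bold>bold</Bold> text.</Para>
 <List>
  <Item>My first bullet</Item>
  <Item>Last bullet</Item>
 </List>
</Doc>"""

# Hypothetical mapping from presentation elements to HTML inline tags.
INLINE_HTML = {"Bold": "b", "Italics": "i", "Underline": "u"}


def inline_html(node):
    """Render mixed content, honoring the presentation hints we recognize."""
    out = escape(node.text or "")
    for child in node:
        inner = inline_html(child)
        tag = INLINE_HTML.get(child.tag)
        # Unknown hints (e.g. Font) are dropped; their text is kept.
        out += f"<{tag}>{inner}</{tag}>" if tag else inner
        out += escape(child.tail or "")
    return out


def to_html(doc):
    """One possible rendering: HTML that keeps bold/italics/underline."""
    parts = []
    for node in doc:
        if node.tag == "Title":
            parts.append(f"<h1>{inline_html(node)}</h1>")
        elif node.tag == "Para":
            parts.append(f"<p>{inline_html(node)}</p>")
        elif node.tag == "List":
            items = "".join(f"<li>{inline_html(i)}</li>" for i in node)
            parts.append(f"<ul>{items}</ul>")
    return "\n".join(parts)


def to_text(doc):
    """Another rendering: plain text that ignores every presentation hint."""
    lines = []
    for node in doc:
        if node.tag == "List":
            lines.extend("* " + "".join(i.itertext()).strip() for i in node)
        else:
            lines.append("".join(node.itertext()).strip())
    return "\n".join(lines)


doc = ET.fromstring(DOC)
print(to_html(doc))
print(to_text(doc))
```

The same source document feeds both functions; only the plain-text one
throws the Bold and Font hints away, which is the choice a wireless or other
non-HTML style-sheet would be free to make.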

Additionally, in Word you can create your own styles (Heading 1, 2, 3,
etc. come as default styles).  When you create your own, say an "Article
Title" style, that would be transformed as <ArticleTitle> instead of
<Heading>, so you can gain intelligence from that and still allow people to
author in Word.
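
As a rough illustration of the style-to-element idea, here is a small Python
sketch.  The naming rule is my assumption for the example (drop spaces and
punctuation, capitalize each word, and fold "Heading 1", "Heading 2", etc.
into <Heading>); it is not a description of the actual conversion code, and
style_to_element and the sample document are made up:

```python
import re
import xml.etree.ElementTree as ET


def style_to_element(style_name):
    """Turn a Word style name into an XML element name (assumed rule)."""
    # Assumption: built-in "Heading 1", "Heading 2", ... all become <Heading>.
    if re.fullmatch(r"Heading \d+", style_name):
        return "Heading"
    # Assumption: custom styles are CamelCased, punctuation and spaces dropped.
    words = re.findall(r"[A-Za-z0-9]+", style_name)
    name = "".join(w[0].upper() + w[1:] for w in words)
    # XML element names cannot be empty or start with a digit.
    if not name or name[0].isdigit():
        name = "Style" + name
    return name


# Made-up document showing what a custom style buys you at query time.
SAMPLE = "<Doc><ArticleTitle>Q4 Results</ArticleTitle><Para>Body</Para></Doc>"
doc = ET.fromstring(SAMPLE)
titles = [e.text for e in doc.iter(style_to_element("Article Title"))]

print(style_to_element("Article Title"))
print(style_to_element("Heading 2"))
print(titles)
```

Once the author tags a line with the "Article Title" style, a query for
<ArticleTitle> finds it directly, which a generic <Heading> or <Bold> could
not tell you.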

An example of the process a Word document goes through, from Word to XML to
HTML can be visualized here:
http://www.metaverse.cc/main.asp?nav=TECHNAV&group=TECHNOLOGY&page=XML%20Walk-Through%20-%20Documents

Regards,
Doug Kerwin
http://www.metaverse.cc

----- Original Message -----
From: "Iva Koberg" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, December 04, 2002 3:39 PM
Subject: RE: [cms-list] WYSIWYG Editor suggestions


> <snip>
> <?xml version="1.0"?>
> <Doc>
>  <Title><Bold><Font RelSize="+2">My Heading</Font></Bold></Title>
>  <Para LineBreak="no" Align="left" Empty="Y"></Para>
>  <Para LineBreak="no" Align="left">This is my test paragraph.  How about
> <Underline>underline</Underline>, <Italics>italics</Italics>, and
> <Bold>bold</Bold>?</Para>
>  <Para LineBreak="no" Align="left" Empty="Y"></Para>
>  <Para LineBreak="no" Align="left">This is a bulleted list</Para>
>  <List>
>   <Item>My first bullet</Item>
>   <Item>And the second bullet</Item>
>   <Item>Last bullet</Item>
>  </List>
> </Doc>
> </snip>
>
> The above XML is not more useful than good old HTML - it mixes content
> and presentation, it provides no meaningful description of the content!
> What does <bold> mean? How can this content be reused in the future? How
> can a semantic query determine what is in <italics>? Is <italics>
> telling me I'd find a citation inside, a reference, an author's name?
> You are not semantically describing the content, you're repackaging HTML
> into different tag names. You're not managing your content any better.
>
>
>
> <snip>
> <span class="content-heading">My Heading</span>
> <br clear="all">
> <p class="content-text" align="left">&nbsp;</p>
> </snip>
>
> By the way, this HTML and CSS is quite invalid (very much like what Word
> produces ;)
>
>
> <snip>
> There's no sense in trying to beat Word as an authoring tool.  I've seen a
> lot of WYSIWYG authoring tools, and they all fall short.  Why go through the
> trouble
> </snip>
>
> A pencil and paper is a very widely used content authoring tool as well,
> but the problem is that content authored that way can't be reused. Same
> goes for Word - companies are spending millions to try and salvage
> information out of Word documents. And what are they finding out? That
> it can't be reliably done programmatically because the goal is to
> transform to meaningful semantic markup and Word does not mark up the
> content semantically. That makes it very worth the trouble to create a
> better tool for authoring content IMO :)
>
> best,
> Iva
>
>
> --
> http://cms-list.org/
> trim your replies for good karma.
>

