> I like the idea of finding an existing Document Object Model to use. We
> should all look around and see whats out there. My number one criteria for
> our DOM is intuitiveness. I would like to find one that fits our needs and
> is intuitive.

Intuitiveness is important. We want users to be productive without unnecessary effort. I think the right level of abstraction gets you 90% of the way there: anything that lets the application developer remain ignorant of all that gobbledygook that you and Praveen exchange:

  "In the explanation given for 'sprmPlncLvl' it says The sprm is three bytes long and consists of the sprm code and a one byte two's complement value."

Ouch! If users can remain as ignorant as I am of what that means, this project will be a success!

Obviously, using standards, official or de facto, is also a good thing. Existing things out there in this domain (rich text models) include:

  1) HTML/XHTML DOM
  2) XSL-FO (formatting objects, as processed by Apache FOP)
  3) OpenOffice.org / OASIS Open Office XML
  4) XML using some other vocabulary

Any others?

You take each one of these and weigh it against a few criteria:

  1) Does it allow a clean separation of content and style? Presumably Word is big on that and we don't want to lose that.
  2) Is it expressive enough to represent the breadth of Word functionality that is important to us?
  3) Is it easy to work with, does it lend itself to tooling, etc.?
  4) Is it popular, widely adopted, etc., such that you might get some synergy with other projects?
  5) Does it lend itself to a high-performance implementation?
  6) Does it make the simple stuff simple while at the same time allowing more ambitious users to do more ambitious things?

HTML by itself fails 1) and 2). Adding CSS stylesheets could remedy that, but it would still lack page-level features, like headers/footers, page numbers, or even hard or soft page breaks. XSL-FO gives a lot more support, though it is rather complex. OpenOffice.org mixes XSL-FO with several other standard markups like MathML, SVG and XLink.
But it gets complex pretty quickly: a simple "hello world" document generates XML with the following namespaces:

  xmlns:office="http://openoffice.org/2000/office"
  xmlns:style="http://openoffice.org/2000/style"
  xmlns:text="http://openoffice.org/2000/text"
  xmlns:table="http://openoffice.org/2000/table"
  xmlns:draw="http://openoffice.org/2000/drawing"
  xmlns:fo="http://www.w3.org/1999/XSL/Format"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:number="http://openoffice.org/2000/datastyle"
  xmlns:svg="http://www.w3.org/2000/svg"
  xmlns:chart="http://openoffice.org/2000/chart"
  xmlns:dr3d="http://openoffice.org/2000/dr3d"
  xmlns:math="http://www.w3.org/1998/Math/MathML"
  xmlns:form="http://openoffice.org/2000/form"
  xmlns:script="http://openoffice.org/2000/script"

It is also something of a moving target now that OASIS is drafting a format standard based on it. Still, I think it will be an attractive and widely used format once it re-emerges as a standard.

I'm afraid I've raised more questions than I've answered ;-) In the end I really don't see anything out there that jumps out and says "I'm the API you want".

-Rob
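
P.S. To make criteria 1) and 6) a bit more concrete, here is a rough sketch of what "content separate from style, simple stuff simple" might look like in an object model. It is in Python only for brevity, and every class, method and property name in it (Document, define_style, add_paragraph, size_pt, ...) is made up for illustration; none of it comes from an existing API or from any of the formats listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Style:
    """A named bundle of formatting properties, stored apart from the text."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Paragraph:
    """Content only: the text plus the *name* of the style it uses."""
    text: str
    style_name: str = "Normal"


@dataclass
class Document:
    """Hypothetical document model: one style table, one list of paragraphs."""
    styles: dict = field(default_factory=dict)
    paragraphs: list = field(default_factory=list)

    def __post_init__(self):
        # Make sure the default style always exists, so the simple case
        # (criterion 6) needs no style setup at all.
        self.styles.setdefault("Normal", Style("Normal"))

    def define_style(self, name, **properties):
        """Create or redefine a style; paragraphs using it are not touched."""
        self.styles[name] = Style(name, dict(properties))
        return self.styles[name]

    def add_paragraph(self, text, style="Normal"):
        """Append a paragraph that refers to an already defined style."""
        if style not in self.styles:
            raise KeyError(f"unknown style: {style}")
        paragraph = Paragraph(text, style)
        self.paragraphs.append(paragraph)
        return paragraph

    def resolved(self):
        """Yield (text, properties) pairs, looking each style up by name."""
        for paragraph in self.paragraphs:
            yield paragraph.text, self.styles[paragraph.style_name].properties


doc = Document()
doc.add_paragraph("Hello world")                   # simple stuff stays simple
doc.define_style("Heading", bold=True, size_pt=16)
doc.add_paragraph("Chapter 1", style="Heading")
doc.define_style("Heading", bold=True, size_pt=20) # restyle without editing content
```

The point of the sketch is only that paragraphs carry a style name rather than formatting, so redefining "Heading" changes how every heading resolves without touching the text. Whether any of the candidate formats maps cleanly onto something like this is exactly the open question.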
