Alan Houser wrote:

I've enjoyed our exchange. The contrast between Michael's and Eliot's opinions is fascinating and insightful. Eliot has a long-standing reputation in the markup languages community, while Michael's reputation is solid as a designer of DITA and of much of the underlying XSLT processing required to implement the DITA architecture.

I've enjoyed it as well. Although I didn't mean to start dueling URLs, it's interesting to see that the same discussions take place everywhere and involve everyone. I guess the main thing we can conclude from this is that there is no single right answer - if there were, we'd be more likely to be talking about gardening strategies than the adaptation of data structures.

Yet they disagree. To add yet another opinion to the mix, Tim Bray, a co-author of the XML recommendation, warns of the requisite effort and risks in designing any new substantial markup vocabulary, and advises readers to begin by evaluating the capabilities of the "big five" proven XML vocabularies (I would add DITA to his list).
http://www.tbray.org/ongoing/When/200x/2006/01/08/No-New-XML-Languages .

An interesting article, but I think he misses the point by the third paragraph.

Designing XML Languages is hard. It’s boring, political,
time-consuming, unglamorous, irritating work. It always takes longer
than you think it will, and when you’re finished, there’s always this
feeling that you could have done more or should have done less or got
some detail essentially wrong.

Adopting an existing language (for want of a better term) doesn't mean that you escape doing the analysis on your dataset. Analysis *is* hard, but you have to do it no matter which route you take, so the saving to be made by adopting rather than creating lies in the schema and the application used to render the data. Once the analysis has been completed, the schema falls out quite naturally, so I tend to disregard it as a substantial issue. The gains then come down to the applications used to render the data. The only problem is that often (but not always) the off-the-shelf (OTS) schema needs tweaking, so your rendering application does as well. At that point you've had to get your hands dirty anyway, and the only tangible gain is that you haven't had to create the parts of the application that were suitable straight out of the box.
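To make "the schema falls out quite naturally" concrete with a deliberately tiny, hypothetical example: if the analysis shows that a procedure in your dataset is always a title followed by one or more steps, the DTD is little more than a transcription of that finding (element names here are invented for illustration):

```xml
<!-- Hypothetical DTD fragment transcribed directly from the analysis:
     a procedure is a title followed by one or more steps, and a step
     is text that may contain emphasis. -->
<!ELEMENT procedure (title, step+)>
<!ELEMENT title     (#PCDATA)>
<!ELEMENT step      (#PCDATA | emphasis)*>
<!ELEMENT emphasis  (#PCDATA)>
```

The hard part was deciding that a procedure has exactly that shape; writing it down in DTD syntax is mechanical.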

So rather than following the custom approach of:

  o  do the analysis,
  o  create the schema,
  o  create the rendering application.

you would end up with the OTS approach of:

  o  do the analysis,
  o  modify the schema,
  o  modify the rendering application.

I believe that, generally, modifying the schema will take longer than creating one from scratch if you've done your analysis well. If you haven't done proper analysis because you intend to use the OTS schema, I don't think it's appropriate to compare the two approaches, as the custom solution would clearly provide a superior result.

If you've had to learn enough about your rendering application to make the appropriate changes, the savings there could be pretty marginal as well. There is also the issue that changing something created by someone else for purposes that you don't require can involve a lot of fiddling.

Why does Michael advocate using DITA out-of-the-box? I can't speak for him, but I suspect the answer lies at least partially in the size and structure of IBM's product development teams, which resemble small-to-medium software companies more than tightly-integrated members of a $150+ billion enterprise.

I would have said exactly the opposite (not speaking for Michael as well... ;-) - I think it comes down to ease of interchange between development teams or business units or whatever they may be called. In the early nineties IBM developed IBM-IDDOC (ftp://ftp.ifi.uio.no/pub/SGML/IBMIDDoc/) specifically as a corporate and interchange standard - from the readme:

The IBMIDDoc DTD has been defined by IBM as the eventual source language
for all product documentation.  IBMIDDoc is intended to both support the
full range of information structures needed by IBM for its information and
to support the free interchange of SGML source documents between IBM, its
business partners, and customers.  In this second role, IBMIDDoc only
serves if it, or DTDs derived from it, are used widely.  To further that
aim, IBM is making the DTD and supporting documentation freely available.

Of course this approach didn't work. As with all large structures, it was too difficult to get everyone doing things the same way, so it failed on pretty much all fronts. You could probably get something useful by converting all the IBM-IDDOC files into a lowest-common-denominator structure and presto! - the idea for DITA was hatched. Make it as simple as possible and allow it to be extended - that should result in better documents, because extending is an active process whereas misusing existing elements is a passive one. I completely agree with that logic. It does make me wonder why DocBook and DITA are so often considered to be a "set", though - it seems to me they sit on opposite sides of the custom solution.
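For readers who haven't looked inside DITA, the "extend rather than misuse" mechanism is the class attribute, which records each specialized element's ancestry back to a base type. A sketch, with a hypothetical "faq" topic type (the element names are invented; the class-attribute syntax is DITA's):

```xml
<!-- A hypothetical faq topic specialized from the base DITA topic.
     The class attribute records the specialization ancestry, so a
     processor that only knows the base topic vocabulary can still
     handle this document by falling back to the topic/* behaviour. -->
<faq id="install-faq" class="- topic/topic faq/faq ">
  <faqtitle class="- topic/title faq/faqtitle ">Installation FAQ</faqtitle>
  <faqbody class="- topic/body faq/faqbody ">
    <p class="- topic/p ">Run the installer as administrator.</p>
  </faqbody>
</faq>
```

Because every specialized element declares its ancestry, stylesheets written for the base vocabulary keep working unmodified - which is what makes extending an "active" and safe process in a way that overloading existing elements never is.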

I tend to agree with you and Eliot for XML implementations in which the business requirements mandate a substantially new vocabulary, and the budget supports the necessary development and implementation effort. However, many (especially smaller) organizations face business needs that can be met by subsetting DocBook or using DITA as-is or nearly so. In addition, these vocabularies provide the necessary processing toolkits for generating output. The latter can be a complex, costly effort that is often out-of-reach of smaller organizations who are evaluating a migration to XML-based publishing.
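For what it's worth, subsetting DocBook is normally done with a customization layer (a small "driver" DTD) rather than by editing the distributed files. A minimal sketch - the particular module names should be checked against your DocBook distribution, but the INCLUDE/IGNORE switching is the standard mechanism:

```xml
<!-- Hypothetical driver DTD that subsets DocBook 4.5 without touching
     the stock files.  Each *.module entity defaults to INCLUDE in the
     distributed DTD; overriding it to IGNORE here switches that
     element off for our documents. -->
<!ENTITY % msgset.module   "IGNORE">  <!-- no message sets -->
<!ENTITY % equation.module "IGNORE">  <!-- no display equations -->

<!ENTITY % docbook PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
          "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
%docbook;
```

Instances then validate against the driver rather than stock DocBook, which has the side benefit of making the subset explicit in every document.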

Perhaps our experiences are just very different, but I don't accept that cost is always the driving factor. In some cases, I think that the allure of having something demonstrable in a short period of time sways the unwary into what may eventually prove to be a limiting course. In many cases, the cost of a custom development can be less than the OTS approach, provided you've done the requisite learning. If you haven't...

Particularly in a forum such as Framers where there are a large number of people coming to structure from an indirect part of the publishing industry, I think it's important to stress that there's no such thing as a free lunch. Doing structure well is hard.

So is the elegant design of a database, but nobody in their right mind would conclude that because it's just data, and tech writers deal with data all day, they're always the appropriate people to do the database design. Nor do you find "Your Database In A Box" (YDIAB?) products that you can just fiddle with until you've got what you need. Why not? Because properly designed databases are too diverse to share enough common components to be worth packaging for modification. Generally I feel the same way about other types of structured data.

I don't see us coming to an agreement on this, but I think it's an interesting and useful discussion anyway.


--
Regards,

Marcus Carr                      email:  [EMAIL PROTECTED]
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein