
Re: Fwd: RTF -> SGML (was Re: [LAP] Application)

Raju Mathur
Thu, 26 Sep 2002 21:18:24 -0700

>>>>> "CVR" == Radhakrishnan CV <[EMAIL PROTECTED]> writes:

>>>>> "Vipul" == Vipul Mathur <[EMAIL PROTECTED]> writes:
    Vipul> On Thu, Sep 26, 2002 at 08:12:12PM +0530, Raju Mathur
    Vipul> wrote:

    >>> Why don't we just use an existing content-management
    >>> application with some enhancements (RTF -> SGML conversion)
    >>> for the Freed site for the time being?

    CVR> Raju: why should RTF be considered as the originating
    CVR> document standard if the final archival format is SGML?  The
    CVR> two formats follow different philosophies in their approach
    CVR> to a document -- the former being presentation-oriented and
    CVR> the latter structure-oriented.

The idea was to make it as easy as possible for students and teachers
to generate content.  We don't want them to start learning new
technology -- that'd be counter-productive to the overall effort.
Hence it's up to us to provide them with a solution that they can
start using from the moment the site is up.

Open Office was another option, since it generates XML directly, but
I'm not too sure about how many people would actually use OO even if
it were provided free to them.

    CVR> [...]

    Vipul> Maybe we could do an RTF/DOC to LaTeX conversion before
    Vipul> going to SGML, might get quite ugly though!

    CVR> RTF==>LaTeX==>SGML will involve more work than you
    CVR> imagine. However, if you succeed in saving RTF as HTML, you
    CVR> can apply a decently written XSLT style sheet to translate it
    CVR> to SGML conformant to your DTD, although this doesn't solve
    CVR> the problem of well-structuredness, for the reasons you list
    CVR> below.

That sounds more doable: MS Word & co can export to HTML, and if it's
possible to convert that HTML to SGML/XML then our problems are over.
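
As a rough sketch of what that HTML -> XML step might look like: the
snippet below uses Python's standard library rather than the XSLT style
sheet suggested above, and the element names (document, section, title,
para) are made up for illustration, not taken from any actual DTD.  It
only helps if the exported HTML uses real heading tags, which is the
structure problem raised elsewhere in this thread.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class HeadingsToSections(HTMLParser):
    """Turn flat <h1>..<h6>/<p> HTML into nested <section> markup.

    The output vocabulary is hypothetical, chosen only for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.out = ["<document>"]
        self.open_levels = []   # heading levels of the sections still open
        self.current = None     # "title" or "para" while collecting text
        self.text = []

    @staticmethod
    def _is_heading(tag):
        return len(tag) == 2 and tag[0] == "h" and tag[1] in "123456"

    def handle_starttag(self, tag, attrs):
        if self._is_heading(tag):
            level = int(tag[1])
            # Close every open section at the same or a deeper level.
            while self.open_levels and self.open_levels[-1] >= level:
                self.open_levels.pop()
                self.out.append("</section>")
            self.open_levels.append(level)
            self.out.append("<section>")
            self.current, self.text = "title", []
        elif tag == "p":
            self.current, self.text = "para", []

    def handle_data(self, data):
        if self.current:
            self.text.append(data)

    def handle_endtag(self, tag):
        ends_title = self.current == "title" and self._is_heading(tag)
        ends_para = self.current == "para" and tag == "p"
        if ends_title or ends_para:
            # Normalise whitespace and re-escape &, < and > for XML.
            body = escape(" ".join("".join(self.text).split()))
            self.out.append(f"<{self.current}>{body}</{self.current}>")
            self.current = None

    def result(self):
        closing = ["</section>"] * len(self.open_levels)
        return "".join(self.out + closing + ["</document>"])


def html_to_sections(html):
    parser = HeadingsToSections()
    parser.feed(html)
    parser.close()
    return parser.result()
```

Calling html_to_sections() on a string of exported HTML returns the
nested markup as a string.  Text marked up only with hard formatting
(bold, larger font) instead of heading tags would come out as plain
paragraphs, so this depends entirely on styles being used.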

    Vipul> I also found that most websites lay stress on the fact that
    Vipul> the originating document must be "well-structured" with
    Vipul> consistent heading styles to enable any decent
    Vipul> conversion. I don't know how one can ensure that, with the
    Vipul> normal tendency of MS Word users (?!) to mark out headings
    Vipul> and sub-headings by applying hard formatting.

    CVR> Most of our authors don't know what a structured document
    CVR> means. And word processor users pay scant attention to
    CVR> well-structured documents, although the document universe is
    CVR> moving towards this paradigm faster than we imagine. The day
    CVR> is not far off when publishers will demand that authors,
    CVR> especially in academia, provide their input in SGML or XML
    CVR> format if they want to get their documents published.

That at least is not an issue: we can enforce styles.  And the content
developers don't have to learn anything new to start using styles in
their documents.

    CVR> [snip]

    CVR> For your authors who use Windows, TUGIndia can provide the
    CVR> TeXLive CDROM (you may burn and distribute it as you
    CVR> like). TeXLive is a ready-to-run TeX system for Linux, Win32
    CVR> and OS X on a single CD, which, if you don't want to install
    CVR> it, can be run from the CD itself without installing onto
    CVR> your hard drive. And a decent tutorial is provided free of
    CVR> cost at the TUGIndia site.

I have the CD with me now; anyone who wants a copy is welcome to pick
it up.

Regards,

-- Raju

    CVR> -- Radhakrishnan


-- 
Raju Mathur               [EMAIL PROTECTED]      http://kandalaya.org/
                      It is the mind that moves