i have a question for all of you xml parsers.

i have an xml file like...

        <data>
                <document>
                        <id></id>
                        <title></title>
                        <subtitle></subtitle>
                        <date></date>
                        <author></author>
                </document>
                <document>
                ...etc...
        </data>


...that has many entries. as you can see, this xml file is acting like a database of documents. the information in the xml is what i will call the "title" information for the document. my question is... what is the best way to put in the "body" of the document?


the reason that i ask is because, putting it all in a tag, like this...

<data>
<document>
<id></id>
<title></title>
<subtitle></subtitle>
<date></date>
<author></author>
<body>Since every penny I earn depends on copyright protection, I'm all in favor of reasonable laws to
do the job.


                                        But there's something kind of sad about the
                                        recording industry's indecent passion to punish
                                        the "criminals" who are violating their rights.

                                        Copyright is a temporary monopoly granted by
                                        the government -- it creates the legal fiction that a
                                        piece of writing or composing (or, as technologies
                                        were created, a recorded performance) is
                                        property and can only be sold by those who have
                                        been licensed to do so by the copyright holder.
                                </body>
                </document>
                <document>
                ...etc...
        </data>


...doesn't seem to make much sense, because...


1. it is so much bigger than the other content

2. you'd have to put a <br> in there for it to display correctly...
   or perhaps a "\n" would work... or something...?

3. if i want to parse the entire directory... it'll get really big if the
   "body" tags contain more than a sentence or two
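On concern 3, one possible way to keep memory use down even with the bodies inline is to stream the file and throw each document away once its "title" fields have been read. This is only a rough sketch, assuming Python's standard xml.etree.ElementTree module; the sample data and the list_titles name are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample in the same shape as the file above, bodies inline.
SAMPLE = b"""<data>
  <document>
    <id>1</id>
    <title>First</title>
    <body>line one
line two</body>
  </document>
  <document>
    <id>2</id>
    <title>Second</title>
    <body>another long body</body>
  </document>
</data>"""


def list_titles(source):
    """Yield (id, title) pairs while streaming through the file."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "document":
            yield elem.findtext("id"), elem.findtext("title")
            # Drop the children (including <body>) so their text can be freed.
            # The emptied <document> element itself stays attached to the root.
            elem.clear()


titles = list(list_titles(io.BytesIO(SAMPLE)))
print(titles)
```

A real file name can be passed to list_titles in place of the BytesIO object. The whole file still has to be read once, so this helps with memory, not with the time spent reading.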



How should i approach this? The best thing that I have come up with is to use the info in the "id" tag to reference a text file by the same name ($id . ".txt"). So, the xml file would then be a list of the documents' "title" info, while the "body" data lives in separate text files.
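To make the idea concrete, here is a minimal sketch of the id-to-text-file approach. It is written in Python with the standard library only, which is an assumption (the $id . ".txt" above suggests some other language), and the names load_index and load_body, the sample id "doc1", and the sample text are all hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Made-up index file: only the "title" information, no <body>.
INDEX_XML = """<data>
  <document>
    <id>doc1</id>
    <title>Copyright</title>
    <subtitle></subtitle>
    <date></date>
    <author></author>
  </document>
</data>"""


def load_index(xml_text):
    """Return {id: {tag: text}} for every <document> in the index."""
    root = ET.fromstring(xml_text)
    index = {}
    for doc in root.iter("document"):
        # Empty tags have text None, so fall back to an empty string.
        fields = {child.tag: (child.text or "") for child in doc}
        index[fields["id"]] = fields
    return index


def load_body(body_dir, doc_id):
    """Read the body from <body_dir>/<id>.txt with its line breaks intact."""
    return (Path(body_dir) / f"{doc_id}.txt").read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as body_dir:
    # Stand-in for a body file that would already exist on disk.
    Path(body_dir, "doc1.txt").write_text(
        "First paragraph.\n\nSecond paragraph.\n", encoding="utf-8"
    )

    index = load_index(INDEX_XML)
    for doc_id, fields in index.items():   # the listing of all "titles"
        print(doc_id, fields["title"])

    body = load_body(body_dir, "doc1")     # one document at a time
    print(body)
```

The listing only ever touches the small xml file, and a body is read only when one document is displayed. If the id ever comes from user input, it would need checking before being turned into a file name.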


Although everything isn't in the same "database," it seems like a good idea because I want to be able to use the data in the xml file for both displaying one document at a time and also displaying a listing of all of the "titles."

If all of the data was the same size (i.e., one sentence or less), this would be easy, but, hey... it's never easy. For those of you who have done an XML parsing project like this before, with data that needs hard returns inside a tag...

how have you done it?

does xml have a special character for end-of-lines like this?
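As far as I can tell from the XML spec, a literal line break inside a tag is kept by the parser (normalized to a single line feed), and &#10; is the character reference for a line feed. A quick sketch to check this, assuming Python's standard xml.etree.ElementTree; the sample text is made up, and the <br> step only matters when the output is HTML.

```python
import html
import xml.etree.ElementTree as ET

# One literal newline and one &#10; character reference inside the tag.
xml_text = "<body>first line\nsecond line&#10;third line</body>"

body = ET.fromstring(xml_text).text
lines = body.split("\n")
print(lines)

# For HTML display only: escape the text, then turn newlines into <br>.
as_html = html.escape(body).replace("\n", "<br>\n")
print(as_html)
```

So the stored text would not need <br> in it; the conversion can happen at display time.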

does this approach (separate txt files for the "body") sound good?


eager for your input,


wade



____________________
BYU Unix Users Group
http://uug.byu.edu/
___________________________________________________________________
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list
