GP,

You might want to check out the RDF tagset.  An example can be found at
http://slashdot.org/slashdot.rdf.  In any case, if you wish to make the
news story's body and/or title searchable, you might want to look into
using Verity.  You can index the XML files, retrieve the content of each
matching doc during a search, and pass it to an XML parser (e.g. soXML).
From there you can use XPath to query the doc for the info you need.  If
you are looking to use an XML parser other than soXML, you might want to
look into building your own custom parser via MSXML or a Java API (JAXP,
Xerces, JDOM, etc.).  Hope this helps.
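
As a rough sketch of the parse-then-query step described above, here is
a minimal example using Python's standard xml.etree.ElementTree module
rather than soXML, MSXML, or a Java API.  The <story>, <title>, and
<body> element names and the helper function names are assumptions made
for illustration; the real story files may be laid out differently, and
a search engine such as Verity would replace the naive substring match.

```python
import xml.etree.ElementTree as ET

# Hypothetical story layout; the real files may use different element names.
SAMPLE_STORY = """<story>
  <title>Markets rally</title>
  <body>Shares rose sharply on Monday.</body>
</story>"""


def extract_fields(xml_text):
    """Parse one story and pull out its title and body text."""
    root = ET.fromstring(xml_text)
    # findtext accepts ElementTree's limited subset of XPath.
    return {
        "title": root.findtext("./title", default=""),
        "body": root.findtext("./body", default=""),
    }


def matches(xml_text, term):
    """Case-insensitive substring match against the title and body."""
    fields = extract_fields(xml_text)
    haystack = (fields["title"] + " " + fields["body"]).lower()
    return term.lower() in haystack
```

Since the fields are read from the document content and not from the
file name, the same two functions can be applied to every file in the
stories folder no matter how the files are named.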

-Brian LeGros

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of gurjit pawar
Sent: Monday, February 04, 2002 12:32 PM
To: '[EMAIL PROTECTED]'
Subject: [cf-xml] how can i do this


I have a directory full of XML format news stories. The names of these
files contain the date and time stamp of each story. The entire batch of
files is held in a stories folder.

A link list will be downloaded from a server every 15 minutes, which
will contain 5-10 links to these news stories, referenced via the file
name which is either in the stories directory or has been added at the
point at which the link list has been downloaded.

What would be the best way of abstracting the data in the news stories,
so that a search page can be developed, to enable users to search
current and previous stories? Also what is the best way to query these
types of XML files, since the file names change regularly?

Is there a predefined format that people are using out there to query
XML?

In the past I have used the SOXML tag to query XML. I find this method
pretty long-winded, especially when there are hundreds of files to
render. Are there other methodologies that are ideal for this type of
situation?



Thank You
Regards
GP

-----------------------+
cf-xml mailing list
list: [EMAIL PROTECTED]
admin: [EMAIL PROTECTED]
home: http://torchbox.com/xml

