GP,

You might want to check out the RDF tagset; an example can be found at http://slashdot.org/slashdot.rdf. In any case, if you wish to make each news story's body and/or title searchable, you might want to look into using Verity. You can index the XML files, then retrieve the content of a matching document at search time and pass it to an XML parser (e.g. soXML). From there you can use XPath to query the document for the info you need. If you are looking to use an XML parser other than soXML, you might want to look into building your own custom parser on top of MSXML or a Java API (JAXP, Xerces, JDOM, etc.).

Hope this helps.
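To make the parse-then-query idea a bit more concrete, here is a rough sketch in Python using only the standard library. It is not Verity or soXML, just an illustration of the same pattern: scan the stories folder, parse each file, and pull out fields with XPath-style expressions. The element names `title` and `body`, the function names, and the folder layout are all assumptions, since the actual story format is not known here.

```python
import glob
import os
import xml.etree.ElementTree as ET


def story_text(path, xpaths=(".//title", ".//body")):
    """Return the text found under each XPath expression in one story file.

    The default expressions assume hypothetical <title> and <body> elements.
    """
    root = ET.parse(path).getroot()
    parts = []
    for xpath in xpaths:
        for node in root.findall(xpath):
            # itertext() also collects text nested inside child elements.
            parts.append("".join(node.itertext()))
    return " ".join(parts)


def search_stories(stories_dir, term):
    """Return the names of story files whose title or body contains term."""
    term = term.lower()
    hits = []
    # The folder is re-scanned on every call, so it does not matter
    # that file names change as new stories arrive.
    for path in sorted(glob.glob(os.path.join(stories_dir, "*.xml"))):
        try:
            text = story_text(path)
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        if term in text.lower():
            hits.append(os.path.basename(path))
    return hits
```

Calling `search_stories("stories", "election")` would return the matching file names in sorted order. With hundreds of files, re-parsing everything on each search gets slow, which is where a prebuilt index (the role Verity plays above) becomes worthwhile.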
-Brian LeGros

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of gurjit pawar
Sent: Monday, February 04, 2002 12:32 PM
To: '[EMAIL PROTECTED]'
Subject: [cf-xml] how can i do this

I have a directory full of news stories in XML format. The names of these files contain the date and time stamp of each story. The entire batch of files is held in a stories folder. A link list will be downloaded from a server every 15 minutes, containing 5-10 links to these news stories. Each link references a story by file name; the file is either already in the stories directory or is added at the point at which the link list is downloaded.

What would be the best way of abstracting the data in the news stories, so that a search page can be developed to enable users to search current and previous stories? Also, what is the best way to query these types of XML files, since the file names change regularly? Is there a predefined format that people are using out there to query XML?

In the past I have used the SOXML tag to query XML. I find this method pretty long-winded, especially when there are hundreds of files to render. Are there other methodologies that are ideal for this type of situation?

Thank You
Regards
GP

-----------------------+
cf-xml mailing list
list: [EMAIL PROTECTED]
admin: [EMAIL PROTECTED]
home: http://torchbox.com/xml
