At 9:05 AM +0100 3/21/01, Michael Schulz wrote:
>is it possible to index XML documents with htdig?

Well, there's nothing stopping you. At the moment, unless you have 
specified an external parser/converter, text/xml documents will be 
indexed as plain text. Not the greatest, but certainly not bad.

But this isn't what most people mean when they say "indexing XML." ;-)

Certainly you can easily work up some sort of parser or converter for 
given types of XML documents. But the 3.1 code has no mechanism for 
restricting searches based on context. So if you want to search only 
the <author></author> field, you're pretty much out of luck.
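
For what it's worth, here is a rough sketch of the kind of converter I 
mean. It is only an illustration, not something that ships with 
ht://Dig: the function name xml_to_text is made up, and I'm assuming 
the file to convert arrives as the first command-line argument, so 
check the external parser documentation for the exact calling 
convention before wiring it in. It just throws away the markup and 
keeps the text, which is all the 3.1 code could use anyway.

```python
import sys
import xml.etree.ElementTree as ET


def xml_to_text(xml_source):
    """Return the text content of an XML document with all markup removed.

    Hypothetical helper for illustration; accepts str or bytes.
    """
    root = ET.fromstring(xml_source)
    # itertext() walks the tree in document order and yields each text node.
    chunks = (chunk.strip() for chunk in root.itertext())
    # Drop whitespace-only nodes and join the rest with single spaces.
    return " ".join(chunk for chunk in chunks if chunk)


def main(argv):
    # Assumption: argv[1] is the path of the document to convert.
    with open(argv[1], "rb") as handle:
        print(xml_to_text(handle.read()))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

Feeding it <doc><author>Jane Doe</author><title>Indexing XML</title></doc> 
gives back the two text runs joined by a space. Note that the 
<author> context is gone after conversion, which is exactly the 
limitation described above.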

The 3.2 code can specify flags for each word that's indexed, so you 
could easily represent nested contexts in your XML documents. However, 
the htsearch code doesn't implement field restriction yet, and while 
many flags are left unused for user manipulation, we don't have a way 
to set them at indexing time yet either.
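
To make the per-word flag idea concrete, here is a toy sketch. None of 
this is the actual 3.2 code: the Context flag names, their values, and 
the index_words and search helpers are all invented for illustration. 
The point is only that a bitmask per word lets nested contexts combine, 
and that field restriction then becomes a mask test at search time.

```python
from enum import IntFlag
import xml.etree.ElementTree as ET


class Context(IntFlag):
    # Hypothetical flag values, not the ones ht://Dig uses.
    NONE = 0
    TITLE = 1
    AUTHOR = 2
    ABSTRACT = 4


# Hypothetical mapping from element names to context flags.
TAG_FLAGS = {
    "title": Context.TITLE,
    "author": Context.AUTHOR,
    "abstract": Context.ABSTRACT,
}


def index_words(element, inherited=Context.NONE):
    """Return (word, flags) pairs; nested elements OR their flags together."""
    flags = inherited | TAG_FLAGS.get(element.tag, Context.NONE)
    entries = [(w.lower(), flags) for w in (element.text or "").split()]
    for child in element:
        entries.extend(index_words(child, flags))
        # Text after a child's closing tag belongs to the enclosing element.
        entries.extend((w.lower(), flags) for w in (child.tail or "").split())
    return entries


def search(entries, word, required=Context.NONE):
    """Return entries matching word whose flags include every required bit."""
    return [
        entry
        for entry in entries
        if entry[0] == word.lower() and (entry[1] & required) == required
    ]


doc = ET.fromstring(
    "<doc><abstract>Notes by <author>Geoff</author> on XML</abstract></doc>"
)
entries = index_words(doc)
author_hits = search(entries, "geoff", Context.AUTHOR)
```

Here "geoff" carries both the AUTHOR and ABSTRACT bits, so it matches a 
search restricted to either field, while "xml" only matches inside 
ABSTRACT.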

Is that a thorough enough answer?

-- 
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
