On Tue, 10 Feb 2004 14:31, Nick Rout wrote:
> Does anyone have any real-world experience running htdig over a large
> number of Word documents?
>
> (Background: in large cases the Police now disclose the file to
> defence counsel on CD (or 5 CDs in this case) full of MS Word documents.
> No index, no analysis, just sequentially numbered files full of
> typewritten docs. If the CDs are full, that's 3G, and it prints out to a
> desk full of Eastlight files.)

http://www.google.co.nz/search?q=managing+gigabytes&ie=ISO-8859-1&hl=en&btnG=Google+Search&meta=

especially
http://www.mds.rmit.edu.au/mg/intro/mgintro.html
The MG (Managing Gigabytes) system is a collection of programs which comprise 
a full-text retrieval system. A full-text retrieval system allows one to 
create a database out of some given documents and then do queries upon it to 
retrieve any relevant documents. It is "full-text" in the sense that every 
word in the text is indexed and the query operates only on this index to do 
the searching. 
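
To make the "query operates only on this index" idea concrete, here is a
minimal sketch of an inverted index in Python. It is not MG's code or
interface; the function names build_index and search and the sample file
names are made up for illustration, and it assumes the Word documents have
already been converted to plain text by some other tool.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map every word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return the documents containing every query word.

    Only the index is consulted; the document text is never rescanned.
    """
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical sample data standing in for text extracted from the CDs.
docs = {
    "0001.txt": "Statement of the witness regarding the vehicle",
    "0002.txt": "Forensic report on the vehicle",
    "0003.txt": "Witness statement, second interview",
}
index = build_index(docs)
print(sorted(search(index, "witness statement")))
```

A real system would also compress the index and rank the results, which this
sketch leaves out entirely.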

No first-hand experience, but a friend waxed lyrical about the system.

-- 
Sincerely etc.
Christopher Sawtell

NB. This PC runs Linux. If you find a virus apparently from me,
it has forged the e-mail headers on someone else's machine.
Please do not notify me when this occurs. Thanks.
