I have an existing archive of discussion-list e-mails that were converted 
to HTML using MHonArc.
Verity chokes on them, I think because they contain HTML comments.  When I 
took one message (an HTML file) and deleted the comments, Verity was able 
to index it and it became searchable.

Is there a configuration change in Verity that avoids this, or do I have to 
process all the files to remove the comments?  The comments hold identifying 
meta-information from the converted messages, e.g., the headers, and 
occur throughout the documents.

<!--X-Subject: DM: DCNet'00: Call for papers -->
<!--X-From: "Simeon J. Simoff" <[EMAIL PROTECTED]> -->
<!--X-Date: Mon, 19 Jun 2000 23:23:04 +1000 -->
<!--X-Message-Id: [EMAIL PROTECTED] -->
<!--X-ContentType: text/plain -->
<!--X-Head-End-->
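
If preprocessing does turn out to be necessary, the cleanup could be sketched 
roughly as below.  This is only an illustration, not a tested fix for Verity: 
it assumes every problem comment starts with "<!--X-" as in the sample above, 
that the files end in ".html", and that it is acceptable to index cleaned 
copies rather than the originals.  The names strip_x_comments and 
clean_archive are hypothetical.

```python
import re
from pathlib import Path

# Matches comments that begin with "<!--X-" (as in the sample above), plus
# the line break that follows, so no empty lines are left behind.
# ".*?" with DOTALL is used because the comment text may itself contain ">".
X_COMMENT = re.compile(r"<!--X-.*?-->[ \t]*\r?\n?", re.DOTALL)


def strip_x_comments(html: str) -> str:
    """Return html with every <!--X-... --> comment removed."""
    return X_COMMENT.sub("", html)


def clean_archive(src_dir: str, dst_dir: str) -> int:
    """Write comment-free copies of src_dir/*.html into dst_dir.

    Returns the number of files processed. The originals are not modified.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.html")):
        # Assumption: latin-1 is used only because it maps every byte to a
        # character, so decoding cannot fail on unknown encodings.
        text = path.read_text(encoding="latin-1")
        (dst / path.name).write_text(strip_x_comments(text), encoding="latin-1")
        count += 1
    return count
```

Calling clean_archive("archive", "archive_clean") (both directory names are 
placeholders) would leave the MHonArc output untouched and produce a parallel 
directory that the Verity collection could be pointed at instead.  Ordinary 
HTML comments that do not start with "X-" are left alone.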

Thanks!

Dorothy
Dorothy Firsching
CEO
Nautilus Systems, Inc.
3867 Alder Woods Court
Fairfax, VA  22033
http://www.nautilus-systems.com/
[EMAIL PROTECTED]


