On July 2, 2002 at 09:10, Ben Ocean wrote:

> >I do not know if LDAP would be efficient for this. If you want
> >to do fulltext searching, I would not recommend LDAP. What kind
> >of searching would you like to do?
>
> I presume it's called full text searching. The *standard* kind of searching
> one does on any search for discussion lists such as for python.org,
> zope.org, etc.

I wanted to be sure, since some types of searching could be appropriate for LDAP. For example, you could store mail header information in LDAP to answer queries like "give me all messages from a given author."

> I thought LDAP would be appropriate here because the data
> doesn't change.

But LDAP is not really designed to do full text searching. LDAP's roots come from X.500, which is basically a standard for providing distributed directory services (addresses, organizations, etc.). The directory service is not intended to support transactions or frequent modifications (though later X.500/LDAP implementations probably handle data modification fairly efficiently). Read-only queries are what X.500/LDAP is supposed to be optimized for.

> Are you saying MySQL is more appropriate?

It could be, and some users have asked for such a thing. However, when it comes to full text retrieval, traditional RDBMSs are not as efficient as full text search engines. The reason in a nutshell: full text search engines index the data into structures (like hashes) to provide fast query results, while an RDBMS is basically doing a fancy grep over large text columns (which would be needed to store message body text). Companies like Oracle do provide some full text indexing add-ons to their RDBMS, but I hear they take some work to configure and may not be that mature.

If you have a lot of computing resources, you could dump everything into a database and let it do all your searches. But it will not scale well and will definitely not give you the performance of full text search engines. You would also have to decide what to do with attachments (probably store file references to them instead of storing them as blobs in the database), and since the text of message bodies can be large, this could affect your schema design and overall database performance.

Where an RDBMS, or LDAP, can be very useful is in meta-based searches.
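
To make the earlier point about indexing versus "fancy grep" concrete, here is a minimal sketch in Python. It is only an illustration of the general idea, not how any particular search engine or RDBMS is implemented. The sample messages and the function names (tokenize, scan_search, build_index, index_search) are made up for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample messages (id -> body text); not from a real archive.
MESSAGES = {
    1: "LDAP is optimized for read-mostly directory lookups",
    2: "Full text search engines build an index ahead of time",
    3: "MySQL can store message headers for meta-based searches",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(messages, word):
    """Grep-like approach: every query rereads every message body."""
    word = word.lower()
    return sorted(mid for mid, body in messages.items()
                  if word in tokenize(body))


def build_index(messages):
    """Build an inverted index once: word -> set of message ids."""
    index = defaultdict(set)
    for mid, body in messages.items():
        for token in tokenize(body):
            index[token].add(mid)
    return index


def index_search(index, word):
    """Answer a query with one hash lookup; bodies are not reread."""
    return sorted(index.get(word.lower(), set()))


INDEX = build_index(MESSAGES)
print(scan_search(MESSAGES, "search"))   # scans all bodies
print(index_search(INDEX, "search"))     # single dictionary lookup
```

Both functions return the same ids. The difference is cost: the scan grows with the total size of the archive on every query, while the index pays that cost once at build time and then answers each single-word query with a lookup.
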
As an example of such a meta-based search, you could store message header information, as mentioned above, to allow useful searches and dynamic archive navigation capabilities beyond the static ones provided by MHonArc. Newer versions of MHonArc include a minimal Perl API that allows something like this. The API is documented in an appendix of the documentation. In a nutshell, you can create a callback function that takes the message header data obtained from MHonArc and stores it in an RDBMS using the Perl DBI modules or, if you prefer LDAP, using the Perl LDAP modules.

--ewh
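
As a rough illustration of the header-storage idea, here is a sketch in Python using the standard sqlite3 and email modules. It is not the MHonArc callback API (which is Perl) and does not use DBI. The table layout, the sample messages, and the function names store_headers and messages_from are all assumptions made up for this example. A real callback would receive the header data from MHonArc instead of parsing raw text itself.

```python
import sqlite3
from email import message_from_string
from email.utils import parseaddr

# Hypothetical raw messages standing in for archived mail.
RAW_MESSAGES = [
    "From: Alice Example <alice@example.org>\n"
    "Subject: LDAP question\n"
    "Message-ID: <1@example.org>\n\nBody one.\n",
    "From: Bob Example <bob@example.org>\n"
    "Subject: Re: LDAP question\n"
    "Message-ID: <2@example.org>\n\nBody two.\n",
    "From: Alice Example <alice@example.org>\n"
    "Subject: Full text search\n"
    "Message-ID: <3@example.org>\n\nBody three.\n",
]


def store_headers(conn, raw):
    """Parse one message and store only its header metadata."""
    msg = message_from_string(raw)
    _name, addr = parseaddr(msg["From"])
    conn.execute(
        "INSERT INTO headers (message_id, author, subject) VALUES (?, ?, ?)",
        (msg["Message-ID"], addr, msg["Subject"]),
    )


def messages_from(conn, author):
    """Meta-based query: subjects of all messages from a given author."""
    rows = conn.execute(
        "SELECT subject FROM headers WHERE author = ? ORDER BY message_id",
        (author,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE headers ("
    "message_id TEXT PRIMARY KEY, author TEXT, subject TEXT)"
)
# An index on author keeps the per-author lookup from scanning the table.
conn.execute("CREATE INDEX idx_author ON headers (author)")
for raw in RAW_MESSAGES:
    store_headers(conn, raw)

print(messages_from(conn, "alice@example.org"))
```

Only short header fields go into the table, so the schema stays small and the message bodies (and attachments) can stay in the archive files.
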
