Jochen, Aaron, and all sql gurus: I've done this (tiny, trivial, minor) change and have pushed it onto svn-trunk. Dbmail now does the body search directly in the database using HAVING SQL_INSENSITIVE_LIKE '%somestring%' against the messageblks.
A massive improvement of body searches both in terms of performance and compliance, I believe. shoot me. I need a break. enjoy. Paul J Stevens wrote: > Jochen, > > Just to keep you up to date though. I've been playing a little with the > non-indexed HAVING queries. Looks like a massive improvement over the > current situation (surpise! :-\ not!) > > some thing like > > select m.message_idnr,k.messageblk > from dbmail_messageblks k > join dbmail_physmessage p on k.physmessage_id = p.id > join dbmail_messages m on p.id = m.physmessage_id > join dbmail_mailboxes b on m.mailbox_idnr = b.mailbox_idnr > join dbmail_users u on b.owner_idnr = u.user_idnr > where u.userid='pablo' and b.name = 'list/dbmail' > having k.messageblk like '%Kündigung%' > > where of course the query will be tweaked to match the engine. This > example is for mysql > > Major speedup, very simple solution. Now on to code. > > > Paul J Stevens wrote: >> Jochen Schroer wrote: >>> Hi Paul, >>> >>> this multiple cenversations decrease the oerformance of body search >>> again, because at the meoment the search process has to load every >>> messagepart of every message the performance is not really good. >>> I'm thinking a little bit about a universal solution, but I have now >>> idea up to now. >>> (for me with postgresql I can add a new messagetext table, fill them >>> with a trigger an d search with simple selects on this table, but that >>> is no universal solution for postgresql, mysql and sqlite) >>> >>> do you have think about a solution where you save the message twice if a >>> mail is inserted into the system? >>> One time in the existing format and additional in another table with >>> converted (UTF8?) content that you can use for search? >> >> I've designed a solution to address this problem. Check the wiki: >> >> http://www.dbmail.org/dokuwiki/doku.php?id=bodysearch >> >> Basically, you can't use indexed searches on blobs. So the idea is to >> store only the /words/ from each attachment in separate records. Once >> that is done, those records /can/ be searched. Even non-indexed searches >> directly on the database (i.e. using HAVING) would probably be an >> improvement over the current situation. >> >> If you have a better idea, please share. >> > > -- ________________________________________________________________ Paul Stevens paul at nfg.nl NET FACILITIES GROUP GPG/PGP: 1024D/11F8CD31 The Netherlands________________________________http://www.nfg.nl
