Hi Paul,
the idea looks good, but postgresql is a little bit more fussy then
mysql, so your SQL failed.
1. for your select postgresql needs a grouping
2. at the moment there is no case insensitive operator for bytea
datatypes (but I'm searching for a workaround)
this statement work (case-sensitive):
SELECT m.message_idnr,k.messageblk FROM dbmail_messageblks k JOIN
dbmail_physmessage p ON k.physmessage_id = p.id JOIN dbmail_messages m
ON p.id = m.physmessage_id WHERE mailbox_idnr = '1' AND status IN
('0','1' ) AND k.is_header = '0' group by m.message_idnr,k.messageblk
HAVING k.messageblk like '%Kündigung%'
The speedup is very good!
Kind regards,
Jochen
> Jochen, Aaron, and all sql gurus:
>
> I've done this (tiny, trivial, minor) change and have pushed it onto
> svn-trunk. Dbmail now does the body search directly in the database
> using HAVING SQL_INSENSITIVE_LIKE '%somestring%' against the messageblks.
>
> A massive improvement of body searches both in terms of performance and
> compliance, I believe.
>
>
>
> shoot me. I need a break.
>
> enjoy.
>
>
>
> Paul J Stevens wrote:
>
>> Jochen,
>>
>> Just to keep you up to date though. I've been playing a little with the
>> non-indexed HAVING queries. Looks like a massive improvement over the
>> current situation (surpise! :-\ not!)
>>
>> some thing like
>>
>> select m.message_idnr,k.messageblk
>> from dbmail_messageblks k
>> join dbmail_physmessage p on k.physmessage_id = p.id
>> join dbmail_messages m on p.id = m.physmessage_id
>> join dbmail_mailboxes b on m.mailbox_idnr = b.mailbox_idnr
>> join dbmail_users u on b.owner_idnr = u.user_idnr
>> where u.userid='pablo' and b.name = 'list/dbmail'
>> having k.messageblk like '%Kündigung%'
>>
>> where of course the query will be tweaked to match the engine. This
>> example is for mysql
>>
>> Major speedup, very simple solution. Now on to code.
>>
>>
>> Paul J Stevens wrote:
>>
>>> Jochen Schroer wrote:
>>>
>>>> Hi Paul,
>>>>
>>>> this multiple cenversations decrease the oerformance of body search
>>>> again, because at the meoment the search process has to load every
>>>> messagepart of every message the performance is not really good.
>>>> I'm thinking a little bit about a universal solution, but I have now
>>>> idea up to now.
>>>> (for me with postgresql I can add a new messagetext table, fill them
>>>> with a trigger an d search with simple selects on this table, but that
>>>> is no universal solution for postgresql, mysql and sqlite)
>>>>
>>>> do you have think about a solution where you save the message twice if a
>>>> mail is inserted into the system?
>>>> One time in the existing format and additional in another table with
>>>> converted (UTF8?) content that you can use for search?
>>>>
>>> I've designed a solution to address this problem. Check the wiki:
>>>
>>> http://www.dbmail.org/dokuwiki/doku.php?id=bodysearch
>>>
>>> Basically, you can't use indexed searches on blobs. So the idea is to
>>> store only the /words/ from each attachment in separate records. Once
>>> that is done, those records /can/ be searched. Even non-indexed searches
>>> directly on the database (i.e. using HAVING) would probably be an
>>> improvement over the current situation.
>>>
>>> If you have a better idea, please share.
>>>
>>>
>>
>
>
>
--
ECS-Solution GmbH
Raiffeisenstr. 26
67817 Imsbach
Germany
Web: http://www.ecs-solution.de
Tel: +49 (6302) 609 71-0
Fax: +49 (6302) 609 71-1