Hi,
today I was thinking about improvements for fulltext search (special
body of article) which I want to discuss here. :)
Current situation / Use case:
o OTRS 2.x
o Intel Core2 2.40GHz
A fulltext search on body with "smith" takes about 15 sek on large
installations (done by normal LIKE search).
Has somebody ideas how to improve this?
I do have 3 ideas.
a) Striping/Unique Content
--------------------------
The idea is quite easy. Just creating a mirror article table (e. g.
article_search) and store all bodies of the article table with unique
words of one article in the article_search table. So OTRS can use this
table 1:1 for fulltext searches.
b) Word-Table/Ref-Word-Article-Table
------------------------------------
This idea is to put all used words in article body into a search_word
(id,word) table and put the relations what words are used in the
article body into a search_word_article (word_id,article_id).
So OTRS can use this table for fulltext searches.
c) Create Index of frequent search words
----------------------------------------
This idea is to put all frequently searched words in an extra index.
And do not frequently search words in real time.
Result with about 12.000 tickets and 50.000 article:
| article |a) article | b)search_words |
c)dedicated
| (default) | mirror uniq | and ref table |
search_word
| | words | | index
------------+--------------+---------------+----------------
+---------------
Storage Size| 60M | 16M | 23M | 8M/
2000words
------------+--------------+---------------+----------------
+---------------
Search Time | 4s | 2s | 1s |
Index 1s
| | | | No
Index 4s
------------+--------------+---------------+----------------
+---------------
Time to | 0 min | 30 min | 2.5 h | 1 h
create Index| | | |
------------+--------------+---------------+----------------
+---------------
Result with about 70.000 tickets and 240.000 article:
| article |a) article | b)search_words |
c)dedicated
| (default) | mirror uniq | and ref table |
search_words
| | words | |
------------+--------------+---------------+----------------
+---------------
Storage Size| 440M | 121M | 394M | 50M/
2000words
------------+--------------+---------------+----------------
+---------------
Search Time | 17s | 5.8s | 48s | Index
3.0s
| | | | No
Index 17s
------------+--------------+---------------+----------------
+---------------
Time to | 0 min | 1 h | 9 h | 4 h
create Index| | | |
------------+--------------+---------------+----------------
+---------------
So in my opinion for small setups b) would work very fine but would
not scale to lager installations.
c) looks good for large installations till not indexed words are
searched (e. g. searches for CVE oder other numbers would always use
no indexed search). a) looks robust to both, large and small
installations.
Any other ideas or questions?
-Martin
--
((otrs)) :: OTRS AG :: Europaring 4 :: D - 94315 Straubing
Fon: +49 (0) 9421 56818 0 :: Fax: +49 (0) 9421 56818 18
http://www.otrs.com/ :: Communication with success!
Address of record: Bad Homburg
Local Court: Bad Homburg, HRB 10751
Tax number: 003/240/97505
Chairman: Burchard Steinbild
Chief Executive Officer: André Mindermann
_______________________________________________
OTRS mailing list: dev - Webpage: http://otrs.org/
Archive: http://lists.otrs.org/pipermail/dev
To unsubscribe: http://lists.otrs.org/cgi-bin/listinfo/dev