On Wed, 14 Feb 2018 14:26:21 +0000 Wout Mertens <wout.mert...@gmail.com> wrote:
> I too am interested in this answer, I still have to start using fts5. > > What would be interesting is to see the `EXPLAIN QUERY PLAN [query]` for > each of your queries, so as to see what causes the slowness. > It is clear what causes the slowness. For example here is one query: select PostFTS.rowid, PostFTS.user as UserName, P.userID, U.av_time as AVer, PostFTS.slug, strftime('%d.%m.%Y %H:%M:%S', P.postTime, 'unixepoch') as PostTime, P.ReadCount, snippet(PostFTS, 0, '*', '*', '...', 16) as Content, PostFTS.Caption, (select count() from UnreadPosts UP where UP.UserID = 2 and UP.PostID = PostFTS.rowid) as Unread from PostFTS left join Posts P on P.id = PostFTS.rowid left join Users U on U.id = P.userID where PostFTS match "user: s*" order by SOME_CLAUSE limit 20; 1. With SOME_CLAUSE=rank, the execution time is between 28ms and 40ms 2. With SOME_CLAUSE=P.PostTime, the execution time is approximately 500ms! 3. Without "order by" clause at all, the execution time is 1.1ms. The respective EXPLAIN QUERY PLAN: 1. order by rank (28..40ms) 0 0 0 SCAN TABLE PostFTS VIRTUAL TABLE INDEX 327713: 0 1 1 SEARCH TABLE Posts AS P USING INTEGER PRIMARY KEY (rowid=?) 0 2 2 SEARCH TABLE Users AS U USING INTEGER PRIMARY KEY (rowid=?) 2. order by P.PostTime (500ms) 0 0 0 SCAN TABLE PostFTS VIRTUAL TABLE INDEX 327681: 0 1 1 SEARCH TABLE Posts AS P USING INTEGER PRIMARY KEY (rowid=?) 0 2 2 SEARCH TABLE Users AS U USING INTEGER PRIMARY KEY (rowid=?) 0 0 0 USE TEMP B-TREE FOR ORDER BY 3. without order by: (1.1ms) 0 0 0 SCAN TABLE PostFTS VIRTUAL TABLE INDEX 327681: 0 1 1 SEARCH TABLE Posts AS P USING INTEGER PRIMARY KEY (rowid=?) 0 2 2 SEARCH TABLE Users AS U USING INTEGER PRIMARY KEY (rowid=?) Obviously, the slow down is because of the "USE TEMP B-TREE FOR ORDER BY". Order by any field other than "rank" and "rowid" makes this query very slow. > On Thu, Feb 8, 2018, 7:14 PM John Found, <johnfo...@asm32.info> wrote: > > > > > I am using FTS5 for pretty complex search in my application, but recently, > > trying to make it even more complex I faced some problems that are more > > general than only FTS5. > > > > I have a forum engine where are several tables for the threads, for the > > posts, for the users etc. At first I want to be able to search in the posts > > text, but moreover, this search have to be limiter to some subset of the > > posts, for example in the posts of a particular thread or posts of some > > user. Also, there are cases where free-text search is not actually > > necessary, for example when I am searching for all posts from a particular > > user. > > > > At first, I tried to create a FTS5 table, containing only the text data > > that need to be searched and then to access it by queries of the type: > > > > select > > some, > > fields > > from > > fts > > left join posts p on p.id = fts.rowid > > left join threads t on t.id = p.threadid > > left join users u on u.id = p.userid > > where > > fts match ?1 and u.nick = ?2 and t.id = ?3 > > order by ORDER > > > > Such queries are pretty fast when there is only fts match directive in the > > where clause. > > But any additional condition added ruins the performance, especially if > > the fts match returns big amount of matches. > > > > Additional problem is the order by clause. If the ORDER BY term is "rank" > > everything works great, but changing it to > > other field (for example the post time in order to get first most recent > > posts) causes huge slow down of the query. > > > > My second attempt was to sacrifice space for speed and to put all > > searchable data in the fts table - post text, the thread titles and the > > usernames. This way, building complex fts queries kind of: > > > > (content: ?1 OR caption: ?2) AND thread: ?3 AND user: ?4 > > > > I can leave only the fts query in the WHERE clause. This way, the search > > is pretty fast, but the huge problem remains > > the ORDER BY clause. Again everything works fine with "rank", but attempts > > to use any other field for sorting, causes > > huge probems: slow downs up to tens of seconds (usual search time is few > > milliseconds) and out of memory errors. > > > > Such problems with this second approach are even more serious than on the > > first approach. i.e. with the second approach everything works fine and > > quick with "rank" order by, and very, very slow and with errors, on any > > other "order by" option. > > > > So, he main question follows: > > > > What is the right way to design such complex search systems, based on FTS? > > How to properly approach the sorting of the search results in order to not > > have so big slowdowns and out of memory errors. > > > > Any tips are highly welcome! > > > > Regards > > -- -- http://fresh.flatassembler.net http://asm32.info John Found <johnfo...@asm32.info> _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users