"Lukas Haase" <[email protected]> wrote in
message news:[email protected]
> I have a database containing thousands of HTML pages ("topics"). There
> is a fulltext index for these topics. First there is a table
> containing all single words. Each word is identified by its
> "fulltextID":
>
> CREATE TABLE fulltext(
> fulltextID INTEGER PRIMARY KEY,
> word VARCHAR(100) COLLATE NOCASE
> );
>
> Now there is a linking table between the words and the HTML pages
> (topics):
>
> CREATE TABLE topic_fulltext(
> topicID INTEGER,
> fulltextID INTEGER,
> PRIMARY KEY(topicID, fulltextID)
> );
>
> Finding a topic containing a specific word is not too hard:
>
> SELECT topic_fulltext.topicID
> FROM fulltext
> JOIN topic_fulltext ON topic_fulltext.fulltextID = fulltext.fulltextID
> WHERE word LIKE 'Word%';
>
> But now I want to be able to search with more complex queries. For
> example:
>
> * List all topics containing (word1 AND word2)
You could do something like this:
SELECT topic_fulltext.topicID FROM topic_fulltext
where exists (select 1 from fulltext
WHERE topic_fulltext.fulltextID = fulltext.fulltextID and word LIKE
'word1%')
and exists (select 1 from fulltext
WHERE topic_fulltext.fulltextID = fulltext.fulltextID and word LIKE
'word2%')
-- or
SELECT topic_fulltext.topicID FROM topic_fulltext
where fulltextID in (
select fulltextID from topic_fulltext where word LIKE 'word1%'
intersect
select fulltextID from topic_fulltext where word LIKE 'word2%');
Test it, see which one works faster.
> * List all topics containing (word1 OR word2)
Similar to above, but replace AND with OR, and INTERSECT with UNION ALL.
> * List all topics containing (word1 AND word2 AND ... AND word10)
> * List all topics containing ((word1 OR word2) AND word3 OR word3)
The approach above should work for any boolean combination.
Igor Tandetnik
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users