On Saturday 12 November 2005 16:04, Michael Segel wrote:
> On Saturday 12 November 2005 10:28, Michael Segel wrote:
> > CREATE TABLE tblSearchDictionary
> > (
> > objectId INT NOT NULL,
> > objectType INT NOT NULL,
> > wordLocation INT NOT NULL,
> > word VARCHAR(64) NOT NULL,
> > CONSTRAINT CONSd0e222 UNIQUE
> > (objectId,objectType,word,wordLocation)
> > );
> >
> > > This table has an index on each of the four columns, it also has the
> > > unique index across all four columns as defined above:
>
> As a follow up...
> Change the constraint order to UNIQUE(objectId, word, wordLocation,
> objectType);
>
> You probably can drop the index I asked you to create if you make this mod.
> Then your query should use the unique constraint's "backing index"...
>
> HTH.
>
> -G
If you want to make this a little bit more efficient, you could try the
following:
1) Create two tables.
CREATE TABLE wordKey
(
wordID INT NOT NULL GENERATED ALWAYS AS IDENTITY,
word VARCHAR(128) NOT NULL,
CONSTRAINT wordKeyCONS01 UNIQUE (wordID, word)
);
// Note this should create the backing INDEX so you don't need to create an
INDEX
CREATE TABLE tblSearchDictionary
(
objectId INT NOT NULL,
objectType INT NOT NULL,
wordLocation INT NOT NULL,
wordID INT NOT NULL,
CONSTRAINT CONSd0e222 UNIQUE (objectId,wordID, wordLocation,
objectType)
);
// Note: I didn't see the SQL Syntax for creating a foreign key on wordID but
you'll want to do that and to also for cascading deletes, even though you may
not ever delete a word from wordKey...
It looks like you're trying to build your own word search index of documents.
If you you'll want to automate it rather than use static queries.
So lets say you're getting a text input of "CONSTRAINT OR BAR AND FOO".
I would interpret this to mean select documents that either contain the word
"CONSTRAINT" or the words "FOO" AND "BAR".
Note that you're going to create a parse tree that says node one contains 1
item, "CONSTRAINT", node two contains 2 items, "FOO", "BAR"...
So you would want to create a SELECT statement that is equivlent to:
SELECT objectID, SUM(wordLocation) SCORE
FROM tblSearchDictionary
WHERE wordID = xxx
UNION (
SELECT objectID, SUM(wordLocation) SCORE
FROM tblSearchDictionary
WHERE wordID IN (yyy,zzz)
)...
GROUP BY objectID
Where xxx, yyy, zzz are integer values that you got from your look up.
While this isn't the complete solution, it should point you in the correct
direction and should work independent of some kludges in the optimizer.
Your wordKey look up statement would be a prepared statement, while your
actual scoring routine would be a dynamic statement.
This should give you what you want.
I believe that there may be more to this for example what happens if you have
a document that contains all three words ....
I'd also recommend the lookup table as a way to decrease the size of your
overall solution and it may change how your optimizer looks at things.
(You're doing your main query all on ints...)
But hey, what do I know? I am just having my first cup of joe and you get what
you paid for... ;-)
-G
--
Michael Segel
Principal
MSCC