[Robots] Indexed keywords

Jean-Marc Fontaine 22 Nov 2001 16:27:54 -0000

Hello,

I am building my database for the spider to fill but I have a problem.

I am trying to make a thematic search engine. I would like to let the user
make complex searches such as "the red rabbit".
But since it is intended to be used on any web server, even free ones (it
will be a light search engine system released under the GPL), it has to consume
as few resources as possible. So I just cannot store the whole page as
Yahoo or the big search engines do.

At first I thought about indexing only the words that seem relevant, but
this way I can only make simple searches (e.g. "rabbit"). Then I thought
about indexing each word together with the previous one and the next one.
This way I should be able to make complex searches even on more than 3
words, since each word can lead to the next one or the previous one, and so
on, e.g. the -> red -> rabbit -> with -> a -> big -> tail
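
To make the idea concrete, here is a rough sketch in Python of the
"word, previous one, next one" index. It is only an illustration under my own
assumptions: the names tokenize, build_index and search are hypothetical, the
tokenizer is deliberately naive, and an in-memory dictionary stands in for
the database table.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(pages):
    """Build word -> [(previous word, next word, page id), ...].

    `pages` is a hypothetical dict mapping a page id to the page text.
    None marks a missing neighbour at the start or end of a page.
    """
    index = defaultdict(list)
    for page_id, text in pages.items():
        words = tokenize(text)
        for i, word in enumerate(words):
            prev_word = words[i - 1] if i > 0 else None
            next_word = words[i + 1] if i + 1 < len(words) else None
            index[word].append((prev_word, next_word, page_id))
    return index


def search(index, query):
    """Return the ids of pages where every query word has the right neighbours.

    The first query word accepts any previous word and the last one accepts
    any next word, so one-word and two-word queries also work.
    """
    words = tokenize(query)
    result = None
    for i, word in enumerate(words):
        want_prev = words[i - 1] if i > 0 else None
        want_next = words[i + 1] if i + 1 < len(words) else None
        pages = {
            page_id
            for prev_word, next_word, page_id in index.get(word, [])
            if (want_prev is None or prev_word == want_prev)
            and (want_next is None or next_word == want_next)
        }
        # A page must satisfy every link of the chain.
        result = pages if result is None else result & pages
    return result or set()


# Made-up sample pages, for illustration only.
sample_pages = {
    1: "the red rabbit with a big tail",
    2: "a red car and a white rabbit",
    3: "the rabbit is red",
}
sample_index = build_index(sample_pages)
```

With these sample pages, search(sample_index, "the red rabbit") finds only
page 1, while search(sample_index, "rabbit") finds all three. Note that the
links are checked per page and not per position, so on long pages this
sketch could report a page where the matching links occur in different
places.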

It seems quite a good way to do it, but since I would like to avoid indexing
"noise words" such as "the" or "a", it is not really satisfying.


Does anyone have suggestions on how to achieve this?
A database schema would be perfect, in fact :-)

Thanks a lot for your time.

Best regards

Jean-Marc


--
This message was sent by the Internet robots and spiders discussion list 
([EMAIL PROTECTED]).  For list server commands, send "help" in the body of a message 
to "[EMAIL PROTECTED]".
