the logic of full text search
Hi there,

I am trying to understand the logic of full-text search in MySQL. I'm not using MySQL 4. The search works OK, except that I get hits on certain words, whilst other words are discarded for some reason or other. Why is that?

An example: I search in a text field for the word organisation and I get hits. When I search for the word scenario, nothing is found, even though I can see the word in the paragraphs myself. Is there an explanation for this?

Thanks,
Sjef
Re: the logic of full text search
I think that right now probably only one or two rows contain the search word. So maybe, once I start entering more rows, the results will be more satisfactory.

Sj

> Sjef Janssen [EMAIL PROTECTED] wrote:
> > No, there are just a few rows in my table, as I am still developing the program. Will it be better when the table is in regular use (and the number of rows will increase)?
>
> How many rows are in the table? How many rows contain the search word?
>
> Egor Egorov, MySQL AB / Ensita.net

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
Re: the logic of full text search
> No, there are just a few rows in my table, as I am still developing the program. Will it be better when the table is in regular use (and the number of rows will increase)?

I believe so. This is what you're seeing, quoted from the MySQL manual:

-- start quote --
The search for the word MySQL produces no results in the above example, because that word is present in more than half the rows. As such, it is effectively treated as a stopword (that is, a word with zero semantic value). This is the most desirable behaviour -- a natural language query should not return every second row from a 1 GB table. A word that matches half of rows in a table is less likely to locate relevant documents. In fact, it will most likely find plenty of irrelevant documents. We all know this happens far too often when we are trying to find something on the Internet with a search engine. It is with this reasoning that such rows have been assigned a low semantic value in this particular dataset.
-- end quote --

Hope that helps!

David P
--
David Precious
http://www.preshweb.co.uk/
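For anyone who wants to see the effect in isolation, here is a rough sketch of the rule described in the manual quote. It is a toy model in Python, not MySQL's real implementation: the function names and the three-row table are invented for illustration, and treating "exactly half" as too common is my assumption (the quoted passage says "more than half").

```python
import re


def rows_containing(rows, word):
    """Return the indices of rows whose text contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [i for i, text in enumerate(rows) if pattern.search(text)]


def natural_language_search(rows, word, threshold=0.5):
    """Toy model of the 50% rule (an assumption, not MySQL's code).

    A word found in at least `threshold` of all rows is treated like a
    stopword and matches nothing.
    """
    hits = rows_containing(rows, word)
    if rows and len(hits) / len(rows) >= threshold:
        return []  # too common in this table, so it is ignored
    return hits


# Hypothetical three-row table, invented for illustration.
table = [
    "The organisation published a new scenario.",
    "Each scenario is reviewed twice.",
    "Nothing relevant here.",
]

print(natural_language_search(table, "organisation"))  # [0]: 1 of 3 rows
print(natural_language_search(table, "scenario"))      # []: 2 of 3 rows
```

In this made-up table, "scenario" is plainly visible in two rows yet the search returns nothing, which is the same symptom as in the original question.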
Re: the logic of full text search
No, there are just a few rows in my table, as I am still developing the program. Will it be better when the table is in regular use (and the number of rows increases)?

Sj

> Do you have many records in the table you're doing a fulltext search on? IME it tends to work better with plenty of rows to work with.
>
> David P
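As a back-of-the-envelope check on whether more rows would help, the sketch below works out how large a table has to be before a word stops counting as "too common". It assumes the cutoff is a fixed share of rows (50% here, with exactly half still counting as too common); both the function name and that boundary are my assumptions, not something confirmed in this thread.

```python
def min_rows_for_match(matching_rows, threshold=0.5):
    """Smallest table size at which a word found in `matching_rows` rows
    falls below `threshold` (assumed cutoff, as a share of all rows)."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    total = max(matching_rows, 1)
    # Grow the table until the word's share of rows drops below the cutoff.
    while matching_rows / total >= threshold:
        total += 1
    return total


for k in (1, 2, 10):
    print(k, min_rows_for_match(k))
# 1 3
# 2 5
# 10 21
```

So under these assumptions a word that sits in one row needs at least three rows in the table, and a word in two rows needs at least five, before a search for it can return anything.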
Re: the logic of full text search
Hi,

> I get hits on certain words, whilst other words are discarded for some reason or other.

AIUI, if a word occurs too many times (in more than x% of rows; I can't remember the exact logic used) then it's treated as a stop word. This means that words that appear in almost every row (like "the", "you", etc.), which would have no value to a search, are ignored. I believe this is what's causing your problem.

Do you have many records in the table you're doing a fulltext search on? IME it tends to work better with plenty of rows to work with.

HTH!

David P
--
David Precious
http://www.preshweb.co.uk/
Re: the logic of full text search
Sjef Janssen [EMAIL PROTECTED] wrote:
> No, there are just a few rows in my table, as I am still developing the program. Will it be better when the table is in regular use (and the number of rows will increase)?

How many rows are in the table? How many rows contain the search word?

--
For technical support contracts, go to https://order.mysql.com/?ref=ensita
This email is sponsored by Ensita.net http://www.ensita.net/
Egor Egorov, [EMAIL PROTECTED]
MySQL AB / Ensita.net
www.mysql.com
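The two questions above can be answered mechanically. The sketch below does it in Python over rows that have already been fetched. The helper name and the sample rows are hypothetical, and it uses a plain case-insensitive substring test, which is cruder than real full-text tokenisation.

```python
def word_share(rows, word):
    """Return (total_rows, rows_with_word, share_of_rows).

    Uses a simple case-insensitive substring test; this is an
    approximation, not how a full-text index tokenises text.
    """
    needle = word.lower()
    total = len(rows)
    matching = sum(1 for text in rows if needle in text.lower())
    share = matching / total if total else 0.0
    return total, matching, share


# Hypothetical rows, invented for illustration.
sample_rows = [
    "A scenario for the new organisation.",
    "Another scenario, described in detail.",
    "Notes from the meeting.",
]

for word in ("organisation", "scenario"):
    total, matching, share = word_share(sample_rows, word)
    print(f"{word}: {matching} of {total} rows ({share:.0%})")
# organisation: 1 of 3 rows (33%)
# scenario: 2 of 3 rows (67%)
```

If the share for the missing word comes out at half the rows or more, that would be consistent with the stop-word explanation given elsewhere in this thread.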