Re: [PHP] Slow searches in large database

2003-10-16 Thread Cesar Cordovez
Manu, et al,

The database at alenet.com is rather tiny (about 28 docs, as you 
mentioned).  I developed the procedure for an intranet with some 5000 
records.  It is working fine.  It has some 20+ clients that are 
constantly hitting the database.  The server is a desktop (nothing out 
of the ordinary, running win32).



Manuel Vázquez Acosta wrote:
I used a variant of this scheme some time ago and it worked well, though I had
not reached 2,000 records.
I separated the words using strtok().

Manu.

PS: How large is the http://www.alenet.com DB? I searched for the word 'the'
(which is likely to be in every English doc) and it returned only 28 docs.
Manu.



Re: [PHP] Slow searches in large database

2003-10-16 Thread Cesar Cordovez
Curt, et al:

You just described my procedure!  I do exactly that, and I also save a 
soundex or metaphone code for each word so I can check spelling and 
suggest closest matches.
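
For illustration, the spelling part could look something like the sketch below 
(untested; it assumes a hypothetical keywords table with word, metaphone and 
docid columns and an open MySQL connection):

<?php
// Store each word together with its metaphone key (soundex() works too).
function index_word($word, $docid)
{
    $word = strtolower($word);
    $key  = metaphone($word);
    $sql  = sprintf("INSERT INTO keywords (word, metaphone, docid) VALUES ('%s', '%s', %d)",
                    mysql_real_escape_string($word),
                    mysql_real_escape_string($key),
                    (int) $docid);
    mysql_query($sql);
}

// "Did you mean?": match on the metaphone key instead of the literal word.
function suggest($misspelled)
{
    $key = mysql_real_escape_string(metaphone(strtolower($misspelled)));
    $res = mysql_query("SELECT DISTINCT word FROM keywords WHERE metaphone = '$key'");
    $out = array();
    while ($row = mysql_fetch_assoc($res)) {
        $out[] = $row['word'];
    }
    return $out;   // e.g. suggest('peeple') should include 'people'
}
?>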

Cesar.

Curt Zirzow wrote:
Instead of just adding a word to the table of words, you add a field
that holds the number of times (qty) it appears in the document.  So now the
SQL looks something like:
select qty, docid from keywords where word='people' order by qty

With an index on qty, the query should be rather fast. And now you
can join the keyword table and the main document table together so
you can display the results:
select k.qty, doc.* from keywords k, documents doc
  where k.docid = doc.id and k.word='people'
  order by k.qty

I do exactly this for the graphs and reports listed below.

Curt


Re: [PHP] Slow searches in large database

2003-10-15 Thread Manuel Vázquez Acosta
I used a variant of this scheme some time ago and it worked well, though I had
not reached 2,000 records.

I separated the words using strtok().
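
For reference, the strtok() loop looks roughly like this (just an illustrative
sketch, not the original code):

<?php
// Walk the string token by token; only the first call gets the subject string.
$text  = 'The quick brown fox';
$words = array();
$tok   = strtok($text, " \t\n");
while ($tok !== false) {
    $words[] = strtolower($tok);
    $tok = strtok(" \t\n");
}
// $words == array('the', 'quick', 'brown', 'fox')
?>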

Manu.

PS: How large is the http://www.alenet.com DB? I searched for the word 'the'
(which is likely to be in every English doc) and it returned only 28 docs.

Manu.

Cesar Cordovez [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Hi!

 I think that you will need a keyword table to speed up this procedure.
   The basic idea is to create an index of words in your fields;
 therefore you will not use LIKE but =, making things run much, much
 faster.

 The steps for doing this are:

 1.  Every time you save a record in the table docs, take all the text
 fields in it and separate them into words.  You can use explode to do
 this.  For example: $words = explode(" ", $record['fieldone'] . " "
 . $record['fieldtwo'] /* etc. */);

 2.  Save the non-repeating words from the array in the keyword table, with a
 reference to the original document, for example the document id.

 3.  Then, if you want to search for, let's say, 'people', you will do:

 select distinct(docid) from keywords where word='people'

 and you will have a list (cursor) with all the documents that have the
 word 'people'.

 Very fast.

 For an example of this, go to http://www.alenet.com and search for
 something.  With very little more effort you can add a spelling suggestion
 (type 'peeple' in the search field) and percentages for each document.

 I use exactly the same procedure in a 5000-record database, and the
 search time is about the same as on alenet.com.

 Cesar





Re: [PHP] Slow searches in large database

2003-10-15 Thread Curt Zirzow
* Thus wrote Cesar Cordovez ([EMAIL PROTECTED]):
 
 2.  Save the non-repeating words from the array in the keyword table, with a
 reference to the original document, for example the document id.
 
 3.  Then, if you want to search for, let's say, 'people', you will do:
 
 select distinct(docid) from keywords where word='people'
 
 and you will have a list (cursor) with all the documents that have the
 word 'people'.

To take it a step further, get a count of how many matches are
made so there is some sort of relevance:

select count(*) as qty, docid  from keywords where word='people'
group by docid order by qty

Doing this, however, will require the db to hit the disk (temp file)
and do sorting and grouping, which is a bad thing. So I'll take it
a bit further :)

Instead of just adding a word to the table of words, you add a field
that holds the number of times (qty) it appears in the document.  So now the
SQL looks something like:

select qty, docid  from keywords where word='people' order by qty

With an index on qty, the query should be rather fast. And now you
can join the keyword table and the main document table together so
you can display the results:

select k.qty, doc.* from keywords k, documents doc
  where k.docid = doc.id and k.word='people'
  order by k.qty
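
To make the qty column concrete, the indexing side could look roughly like
this (an illustrative sketch only, assuming a hypothetical table
keywords(word, docid, qty) with an index on (word, qty) and an open MySQL
connection):

<?php
// Count how often each word appears in a document and store one row per word.
function index_document($docid, $text)
{
    // Split on anything that is not a letter or digit; drop empty tokens.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    // array_count_values() gives the per-word qty in one pass.
    foreach (array_count_values($words) as $word => $qty) {
        $sql = sprintf("INSERT INTO keywords (word, docid, qty) VALUES ('%s', %d, %d)",
                       mysql_real_escape_string($word),
                       (int) $docid,
                       (int) $qty);
        mysql_query($sql);
    }
}
?>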


I do exactly this for the graphs and reports listed below.

Curt
-- 
My PHP key is worn out

  PHP List stats since 1997: 
  http://zirzow.dyndns.org/html/mlists/




Re: [PHP] Slow searches in large database

2003-10-14 Thread Rory McKinley
Hi Adrian

Somewhere in the back of my mind... I remember reading that if you are using
LIKE, putting in wildcards fore and aft, e.g. '%string%', slows down queries
somewhat.
I will have a look at my MySQL stuff and see if I can confirm... but if I
recall correctly, is there any way you can do away with one of the
wildcards?

Rory McKinley
Nebula Solutions
+27 82 857 2391
[EMAIL PROTECTED]
There are 10 kinds of people in this world,
those who understand binary and those who don't (Unknown)
- Original Message - 
From: Adrian Teasdale [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, October 14, 2003 12:54 AM
Subject: [PHP] Slow searches in large database


 Hi there

 Wondering if someone could help or give some advice.

 We have a MySQL database that has approximately 20,000 records and has a
 total size of 125MB.  There are approximately 25 fields that we need to
 search each time that someone performs a search.  We have installed
 TurckMMCache onto the server, which sped up the searching, but it
 still takes around 15 seconds for the results to be displayed.

 An example of one of our search strings is:

 select docs.* from docs where 1 and CY IN ('GB')  and (TI like
 '%searchstring%' or PD like '%searchstring%' or ND like '%searchstring%'
 or DR like '%searchstring%' or DS like '%searchstring%' or DD like
 '%searchstring%' or DT like '%searchstring%' or RN like '%searchstring%'
 or HD like '%searchstring%' or TD like '%searchstring%' or NC like
 '%searchstring%' or PR like '%searchstring%' or RP like '%searchstring%'
 or AA like '%searchstring%' or TY like '%searchstring%' or AC like
 '%searchstring%' or PC like '%searchstring%' or RC like '%searchstring%'
 or RG like '%searchstring%' or AU like '%searchstring%' or TW like
 '%searchstring%' or CO like '%searchstring%' or AB like '%searchstring%'
 or TX like '%searchstring%')

 Basically, is there anything that anyone can immediately suggest that we
 need to do to speed things up?

 Thanks

 Ade








Re: [PHP] Slow searches in large database

2003-10-14 Thread Cesar Cordovez
Hi!

I think that you will need a keyword table to speed up this procedure.
The basic idea is to create an index of words in your fields;
therefore you will not use LIKE but =, making things run much, much
faster.

The steps for doing this are:

1.  Every time you save a record in the table docs, take all the text
fields in it and separate them into words.  You can use explode to do
this.  For example: $words = explode(" ", $record['fieldone'] . " "
. $record['fieldtwo'] /* etc. */);

2.  Save the non-repeating words from the array in the keyword table, with a
reference to the original document, for example the document id.

3.  Then, if you want to search for, let's say, 'people', you will do:

select distinct(docid) from keywords where word='people'

and you will have a list (cursor) with all the documents that have the
word 'people'.

Very fast.
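
For illustration only (not the actual alenet.com code), steps 1-3 might look
roughly like this, assuming a hypothetical keywords(word, docid) table and an
open MySQL connection:

<?php
// Steps 1 and 2: split the record's text fields into words and store each
// distinct word once, with a reference back to the document.
function index_record($docid, $record)
{
    $text  = $record['fieldone'] . ' ' . $record['fieldtwo']; // etc.
    $words = array_unique(explode(' ', strtolower($text)));

    foreach ($words as $word) {
        if ($word === '') {
            continue;
        }
        $word = mysql_real_escape_string($word);
        mysql_query("INSERT INTO keywords (word, docid) VALUES ('$word', " . (int) $docid . ")");
    }
}

// Step 3: look the word up with = instead of LIKE.
$res = mysql_query("SELECT DISTINCT docid FROM keywords WHERE word = 'people'");
while ($row = mysql_fetch_assoc($res)) {
    echo $row['docid'], "\n";
}
?>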

For an example of this, go to http://www.alenet.com and search for 
something.  With very little more effort you can add a spelling suggestion 
(type 'peeple' in the search field) and percentages for each document.

I use exactly the same procedure in a 5000-record database, and the 
search time is about the same as on alenet.com.

Cesar

An example of one of our search strings is:

select docs.* from docs where 1 and CY IN ('GB')  and (TI like
'%searchstring%' or PD like '%searchstring%' or ND like '%searchstring%'
or DR like '%searchstring%' or DS like '%searchstring%' or DD like
'%searchstring%' or DT like '%searchstring%' or RN like '%searchstring%'
or HD like '%searchstring%' or TD like '%searchstring%' or NC like
'%searchstring%' or PR like '%searchstring%' or RP like '%searchstring%'
or AA like '%searchstring%' or TY like '%searchstring%' or AC like
'%searchstring%' or PC like '%searchstring%' or RC like '%searchstring%'
or RG like '%searchstring%' or AU like '%searchstring%' or TW like
'%searchstring%' or CO like '%searchstring%' or AB like '%searchstring%'
or TX like '%searchstring%')


Re: [PHP] Slow searches in large database

2003-10-14 Thread Jason Sheets
Rather than searching every field for every search, I usually provide a 
select drop-down or checkbox that allows the user to indicate what 
information they are searching, and then only hit those fields in the SQL 
query.
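
Something like this rough sketch (illustrative only; the column names come
from Adrian's query and the form field names are made up):

<?php
// The search form posts which column to search; only whitelisted
// column names are ever interpolated into the SQL.
$allowed = array('TI', 'AB', 'TX', 'AU');
$field   = in_array($_POST['field'], $allowed) ? $_POST['field'] : 'TI';
$term    = mysql_real_escape_string($_POST['searchstring']);

$sql = "SELECT * FROM docs WHERE CY IN ('GB') AND $field LIKE '%$term%'";
$res = mysql_query($sql);
?>
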
MySQL is fast for simple queries, but it doesn't scale well with larger 
databases or complex queries.  You might look at building an index like 
someone mentioned, but that will eat more disk space as well.  Make sure 
you are using indexes efficiently, and re-evaluate your search system to 
determine if you really need to search what you are searching.

You may also want to google for MySQL optimization hacks to help tune the 
mysql daemon itself.

Jason



Cesar Cordovez wrote:

Hi!

I think that you will need a keyword table to speed up this 
procedure.  The basic idea is to create an index of words in your 
fields; therefore you will not use LIKE but =, making things run 
much, much faster.

The steps for doing this are:

1.  Every time you save a record in the table docs, take all the text 
fields in it and separate them into words.  You can use explode to do 
this.  For example: $words = explode(" ", $record['fieldone'] . " "
. $record['fieldtwo'] /* etc. */);

2.  Save the non-repeating words from the array in the keyword table, with 
a reference to the original document, for example the document id.

3.  Then, if you want to search for, let's say, 'people', you will do:

select distinct(docid) from keywords where word='people'

and you will have a list (cursor) with all the documents that have the 
word 'people'.

Very fast.

For an example of this, go to http://www.alenet.com and search for 
something.  With very little more effort you can add a spelling suggestion 
(type 'peeple' in the search field) and percentages for each document.

I use exactly the same procedure in a 5000-record database, and 
the search time is about the same as on alenet.com.

Cesar

An example of one of our search strings is:

select docs.* from docs where 1 and CY IN ('GB')  and (TI like
'%searchstring%' or PD like '%searchstring%' or ND like '%searchstring%'
or DR like '%searchstring%' or DS like '%searchstring%' or DD like
'%searchstring%' or DT like '%searchstring%' or RN like '%searchstring%'
or HD like '%searchstring%' or TD like '%searchstring%' or NC like
'%searchstring%' or PR like '%searchstring%' or RP like '%searchstring%'
or AA like '%searchstring%' or TY like '%searchstring%' or AC like
'%searchstring%' or PC like '%searchstring%' or RC like '%searchstring%'
or RG like '%searchstring%' or AU like '%searchstring%' or TW like
'%searchstring%' or CO like '%searchstring%' or AB like '%searchstring%'
or TX like '%searchstring%')




Re: [PHP] Slow searches in large database

2003-10-14 Thread Mohamed Lrhazi
You might want to see if a search engine/indexing tool can help you; I
use:

http://search.mnogo.ru/features.html


On Mon, 2003-10-13 at 18:54, Adrian Teasdale wrote:
 Hi there
 
 Wondering if someone could help or give some advice.
 
 We have a MySQL database that has approximately 20,000 records and has a
 total size of 125MB.  There are approximately 25 fields that we need to
 search each time that someone performs a search.  We have installed
 TurckMMCache onto the server, which sped up the searching, but it
 still takes around 15 seconds for the results to be displayed.




[PHP] Slow searches in large database

2003-10-13 Thread Adrian Teasdale
Hi there

Wondering if someone could help or give some advice.

We have a MySQL database that has approximately 20,000 records and has a
total size of 125MB.  There are approximately 25 fields that we need to
search each time that someone performs a search.  We have installed
TurckMMCache onto the server, which sped up the searching, but it
still takes around 15 seconds for the results to be displayed.

An example of one of our search strings is:

select docs.* from docs where 1 and CY IN ('GB')  and (TI like
'%searchstring%' or PD like '%searchstring%' or ND like '%searchstring%'
or DR like '%searchstring%' or DS like '%searchstring%' or DD like
'%searchstring%' or DT like '%searchstring%' or RN like '%searchstring%'
or HD like '%searchstring%' or TD like '%searchstring%' or NC like
'%searchstring%' or PR like '%searchstring%' or RP like '%searchstring%'
or AA like '%searchstring%' or TY like '%searchstring%' or AC like
'%searchstring%' or PC like '%searchstring%' or RC like '%searchstring%'
or RG like '%searchstring%' or AU like '%searchstring%' or TW like
'%searchstring%' or CO like '%searchstring%' or AB like '%searchstring%'
or TX like '%searchstring%')

Basically, is there anything that anyone can immediately suggest that we
need to do to speed things up?

Thanks 

Ade




RE: [PHP] Slow searches in large database

2003-10-13 Thread Chris W. Parker
Adrian Teasdale mailto:[EMAIL PROTECTED]
on Monday, October 13, 2003 3:54 PM said:

 An example of one of our search strings is:

[snip]

 Basically, is there anything that anyone can immediately suggest that
 we need to do to speed things up?

1. You could try changing 'docs.*' to an explicit list of each field you
want to SELECT. I've been told through numerous sources that even if you
are ultimately going to SELECT every field in a table, it's still faster
to name those fields explicitly, because the server does not need to
look up the table's column list first (see the sketch after this list).

2. Why don't you explain what your database is used for and what each
field does? It's possible that there is a more efficient way to
structure your table, resulting in faster search times.

3. I don't know much about indexes, but are there indexes on each field
that you are searching? As far as I know, indexes are supposed to speed
searching up.
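
For point 1, the explicit column list would look roughly like this
(illustrative only; the column names are copied from Adrian's query):

<?php
// Name the columns instead of using docs.*
$sql = "SELECT TI, PD, ND, DR, DS, DD, DT, RN, HD, TD, NC, PR, RP,
               AA, TY, AC, PC, RC, RG, AU, TW, CO, AB, TX
        FROM docs
        WHERE CY IN ('GB') AND TI LIKE '%searchstring%' /* ... */";
$res = mysql_query($sql);
?>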


hth,
chris.

--
Don't like reformatting your Outlook replies? Now there's relief!
http://home.in.tum.de/~jain/software/outlook-quotefix/




Re: [PHP] Slow searches in large database

2003-10-13 Thread Miles Thompson
Start by checking the MySQL docs to find out if indexes are used with IN 
and LIKE, especially as the latter is using leading wildcards. I suspect not.

Given that you are essentially performing a sequential read of the 
database and checking all of these fields, your performance is remarkably good.

Solution? Use indexes wherever possible and refine your query. Do you have 
to search on all those fields? Is there any way of deriving a subset?

Alternatively, investigate MySQL full-text search and see if you can put that 
to work for you.
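
For reference, a rough sketch of the full-text route (illustrative only; it
assumes the docs table is MyISAM, which MySQL of this vintage requires for
FULLTEXT indexes, and the index name and column choice are made up from
Adrian's query):

<?php
// One-time: add a FULLTEXT index over a few of the text columns.
mysql_query("ALTER TABLE docs ADD FULLTEXT ft_docs (TI, AB, TX)");

// Then search with MATCH ... AGAINST instead of a pile of LIKEs.
$term = mysql_real_escape_string('searchstring');
$res  = mysql_query("SELECT * FROM docs
                     WHERE CY IN ('GB')
                       AND MATCH (TI, AB, TX) AGAINST ('$term')");
?>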

I also recommend posting this question on the MySQL list, as it's really a 
DB question, not a PHP one.

Cheers - Miles Thompson

At 11:54 PM 10/13/2003 +0100, Adrian Teasdale wrote:
Hi there

Wondering if someone could help or give some advice.

We have a MySQL database that has approximately 20,000 records and has a
total size of 125MB.  There are approximately 25 fields that we need to
search each time that someone performs a search.  We have installed
TurckMMCache onto the server, which sped up the searching, but it
still takes around 15 seconds for the results to be displayed.

An example of one of our search strings is:

select docs.* from docs where 1 and CY IN ('GB')  and (TI like
'%searchstring%' or PD like '%searchstring%' or ND like '%searchstring%'
or DR like '%searchstring%' or DS like '%searchstring%' or DD like
'%searchstring%' or DT like '%searchstring%' or RN like '%searchstring%'
or HD like '%searchstring%' or TD like '%searchstring%' or NC like
'%searchstring%' or PR like '%searchstring%' or RP like '%searchstring%'
or AA like '%searchstring%' or TY like '%searchstring%' or AC like
'%searchstring%' or PC like '%searchstring%' or RC like '%searchstring%'
or RG like '%searchstring%' or AU like '%searchstring%' or TW like
'%searchstring%' or CO like '%searchstring%' or AB like '%searchstring%'
or TX like '%searchstring%')
Basically, is there anything that anyone can immediately suggest that we
need to do to speed things up?
Thanks

Ade
