Consider downloading 3.1.xx. It has phrase support as well
as a so-called "cache mode", which can search very quickly
through millions of documents.



Shane Wegner wrote:
> 
> I am using UDMSearch 3.0.23 with MySQL at the moment and
> have a few suggestions regarding the software.  Please note
> that I have only briefly looked at the development version
> so some of these may already be implemented.
> 
> I am using crc-multi for data storage and indexer seems to
> slow down quite a lot when there are about 75000 URLs in
> the database.  This site has about 125000 to index.  I
> noticed it deletes from the various ndict tables even when
> the URL had not previously been indexed.  Is that really
> necessary or couldn't it just delete if status != 0?
> 
> Also, MySQL seems to really slow down when a table has more
> than 2,000,000 rows.  Do you think there would be any
> performance increase in splitting the various ndict tables
> into say four tables?  Say ndict4-1, ndict4-2, ndict4-3,
> and ndict4-4 and just use the first two bits of the crc32
> value to determine where it goes.  I'm sure there's a point
> where you have too many tables and performance can suffer
> the other way but that should cut the size of each ndict
> table by 4.
> 
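
A rough sketch of the table split described above, in Python for brevity
and not taken from the indexer source. It assumes zlib's CRC32, which may
not match the crc32 value the indexer stores, and ndict_table_for is a
hypothetical helper name:

```python
import zlib

def ndict_table_for(word: str, base: str = "ndict4") -> str:
    """Pick one of four tables from the top two bits of the word's CRC32.

    Assumption: zlib.crc32 stands in for whatever crc32 the indexer uses.
    """
    crc = zlib.crc32(word.encode("utf-8")) & 0xFFFFFFFF  # force unsigned 32-bit
    shard = crc >> 30                                    # top two bits: 0..3
    return f"{base}-{shard + 1}"                         # ndict4-1 .. ndict4-4

if __name__ == "__main__":
    for w in ("mail", "user", "agent"):
        print(w, "->", ndict_table_for(w))
```

A lookup for one word then touches a single table, since the same word
always maps to the same table. Note that table names containing a hyphen
would need quoting in MySQL.
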
> My URLs primarily look like this:
> http://www.cm.nu/~shane/lists/destin/2001-03/xxxxxxx.html
> Do you think there is a way to store the host/path part of
> a URL separately from the referenced file?  It seems
> redundant to be storing protocol://host/path/filename over
> and over when often, parts of these keep repeating.
> 
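
One way to picture the URL split described above, as a minimal in-memory
sketch. It assumes a naive split at the last "/", and UrlStore and
split_url are hypothetical names, not part of the software:

```python
def split_url(url: str) -> tuple[str, str]:
    """Split a URL into (protocol://host/path/, filename) at the last slash."""
    prefix, sep, filename = url.rpartition("/")
    return prefix + sep, filename

class UrlStore:
    """Store each distinct prefix once and reference it by a small integer id."""

    def __init__(self) -> None:
        self.prefix_ids: dict[str, int] = {}     # prefix -> id
        self.prefixes: list[str] = []            # id -> prefix
        self.urls: list[tuple[int, str]] = []    # url id -> (prefix id, filename)

    def add(self, url: str) -> int:
        prefix, filename = split_url(url)
        pid = self.prefix_ids.get(prefix)
        if pid is None:                          # first time this prefix is seen
            pid = len(self.prefixes)
            self.prefix_ids[prefix] = pid
            self.prefixes.append(prefix)
        self.urls.append((pid, filename))
        return len(self.urls) - 1

    def get(self, url_id: int) -> str:
        pid, filename = self.urls[url_id]
        return self.prefixes[pid] + filename
```

With URLs like the one quoted, every message in the same month's directory
shares one prefix entry, so only the short filename is stored per URL.
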
> Last thing: as far as I know, there is no way I can search
> for a quoted string.  For example, "Mail User Agent" won't
> necessarily come up with pages containing that exact
> string.  I think this could be implemented fairly simply by
> adding an int to the ndict and dict tables specifying the
> word number of the referenced word.  It would mean storing
> a word that appears several times in a document more than
> once, but would allow this to work.
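
The word-number idea in the last paragraph can be sketched roughly as
follows. This is an in-memory illustration only, assuming whitespace
tokenization and case folding; build_index and phrase_search are
hypothetical names and this is not how 3.1.xx implements phrase support:

```python
from collections import defaultdict

def build_index(docs: dict[int, str]) -> dict[str, list[tuple[int, int]]]:
    """Map each word to a list of (doc_id, word_number) entries."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].append((doc_id, pos))  # repeated words get one entry each
    return index

def phrase_search(index, phrase: str) -> set[int]:
    """Return ids of documents containing the words of `phrase` consecutively."""
    words = phrase.lower().split()
    if not words:
        return set()
    # Start from every occurrence of the first word ...
    candidates = set(index.get(words[0], []))
    # ... and keep those where word k appears exactly k positions later.
    for offset, word in enumerate(words[1:], start=1):
        positions = set(index.get(word, []))
        candidates = {(d, p) for (d, p) in candidates if (d, p + offset) in positions}
    return {d for d, _ in candidates}

if __name__ == "__main__":
    docs = {1: "a Mail User Agent reads mail", 2: "User Agent Mail"}
    print(phrase_search(build_index(docs), "Mail User Agent"))
```

Both documents contain all three words, but only the first has them in
consecutive positions, so only it matches the quoted phrase.
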