Consider downloading 3.1.xx. It has phrase support as well
as a so-called "cache mode", which can search very quickly
through millions of documents.
Shane Wegner wrote:
>
> I am using UDMSearch 3.0.23 with MySQL at the moment and
> have a few suggestions regarding the software. Please note
> that I have only briefly looked at the development version
> so some of these may already be implemented.
>
> I am using crc-multi for data storage and the indexer seems
> to slow down quite a lot when there are about 75000 URLs in
> the database. This site has about 125000 to index. I
> noticed it deletes from the various ndict tables even when
> the URL had not previously been indexed. Is that really
> necessary or couldn't it just delete if status != 0?
>
> Also, MySQL seems to really slow down when a table has more
> than 2,000,000 rows. Do you think there would be any
> performance increase in splitting each of the ndict tables
> into, say, four tables? Say ndict4-1, ndict4-2, ndict4-3,
> and ndict4-4, using the first two bits of the crc32 value
> to determine where a word goes. I'm sure there's a point
> where you have too many tables and performance can suffer
> the other way, but that should cut the size of each ndict
> table to about a quarter.
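
For illustration, the split proposed above could be sketched like
this in Python. The function name ndict_shard is made up, the table
names follow the ndict4-1..ndict4-4 naming from the message, and it
is only an assumption that the crc32 stored by crc-multi matches
zlib's CRC-32.

```python
import zlib


def ndict_shard(word: str, base: str = "ndict4", bits: int = 2) -> str:
    """Pick a sub-table from the top `bits` bits of the word's CRC-32."""
    # zlib.crc32 stands in for whatever crc32 the indexer stores (assumption).
    crc = zlib.crc32(word.encode("utf-8")) & 0xFFFFFFFF
    index = crc >> (32 - bits)  # 0..3 when bits == 2
    return f"{base}-{index + 1}"  # tables are numbered from 1


if __name__ == "__main__":
    for w in ("mail", "user", "agent"):
        print(w, "->", ndict_shard(w))
```

Since the table is derived from the crc32 alone, a search only has to
look in one of the four tables for each query word.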
>
> My URLs primarily look like this:
> http://www.cm.nu/~shane/lists/destin/2001-03/xxxxxxx.html
> Do you think there is a way to store the host/path part of
> a URL separately from the referenced file? It seems
> redundant to be storing protocol://host/path/filename over
> and over when often, parts of these keep repeating.
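
A minimal sketch of that idea in Python, assuming a separate lookup
table of directory prefixes. split_url and PrefixTable are
hypothetical names, not anything in UdmSearch.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple[str, str]:
    """Split a URL into (protocol://host/path/, filename)."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if parts.query:
        filename += "?" + parts.query  # keep the query with the file part
    return f"{parts.scheme}://{parts.netloc}{directory}/", filename


class PrefixTable:
    """Stores each distinct prefix once and hands out a small integer id."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def intern(self, prefix: str) -> int:
        # Existing prefixes keep their id; new ones get the next number.
        return self._ids.setdefault(prefix, len(self._ids) + 1)


if __name__ == "__main__":
    table = PrefixTable()
    prefix, name = split_url(
        "http://www.cm.nu/~shane/lists/destin/2001-03/xxxxxxx.html"
    )
    print(table.intern(prefix), prefix, name)
```

Each URL row would then hold only the prefix id and the filename, so
a directory with thousands of messages stores its prefix once.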
>
> Last thing, as far as I know, there is no way I can search
> for a quoted string; for example, "Mail User Agent" won't
> necessarily come up with pages containing that exact
> string. I think this could be implemented fairly simply by
> adding an int to the ndict and dict tables specifying the
> word number of the referenced word. It would mean storing
> a word that occurs several times in a document more than
> once, but would allow this to work.
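
The word-number scheme described above amounts to a positional
index. A rough Python sketch follows; it illustrates the proposal
only, not how phrase support is actually implemented in UdmSearch,
and build_index and phrase_search are made-up names.

```python
from collections import defaultdict


def build_index(docs: dict[int, str]):
    """Map word -> doc id -> list of word numbers (positions)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)  # one entry per occurrence
    return index


def phrase_search(index, phrase: str) -> set[int]:
    """Return ids of documents containing the words consecutively."""
    words = phrase.lower().split()
    if not words or any(w not in index for w in words):
        return set()
    hits = set()
    for doc_id, starts in index[words[0]].items():
        for start in starts:
            # Word i of the phrase must sit at position start + i.
            if all(start + i in index[w].get(doc_id, ())
                   for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits


if __name__ == "__main__":
    docs = {1: "a mail user agent reads mail",
            2: "the user agent of a mail program"}
    print(phrase_search(build_index(docs), "Mail User Agent"))
```

The cost is the one noted in the message: a word that repeats in a
document gets one row per occurrence instead of a single row.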