Since ASPseek 1.2.5 there is a new UTF-8 storage mode.
It is said to reduce memory and hard drive usage and to increase indexing and
search speed. Now I have two questions:

1) Can an existing index/database be converted to UTF-8 with index -b? Can this be
done on a production index? You also have to set up a special config entry in
searchd/aspseek.conf. What happens during conversion? Can the database still
be searched, or will there be downtime? At the moment we have about
150,000 web pages indexed.
The old database can still be searched during the conversion process.
Search will be interrupted only for a short time, while the MySQL tables are
renamed and "searchd" is restarted.
2) UTF-8 will be most efficient with US-ASCII. What happens when there are also
words with special characters like the German umlauts? Will all those
improvements still apply?
ASCII characters in those words will be encoded as 1 byte each, and umlauts as 2 bytes each.
In any case, the "wordurl" table will be smaller than with plain Unicode storage.
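
To make the byte counts concrete, here is a small Python sketch that compares the encoded size of a few words. It is only an illustration, not ASPseek code: the helper name storage_sizes is made up for this example, and it assumes that "plain Unicode storage" means a fixed 2 bytes per character (modelled here with UTF-16), which the thread does not state explicitly.

```python
def storage_sizes(word):
    """Return (utf8_bytes, fixed_two_byte_bytes) for a word.

    Assumption: "plain Unicode storage" is modelled as a fixed 2 bytes
    per character (UTF-16 without a byte order mark).
    """
    utf8_bytes = len(word.encode("utf-8"))
    fixed_bytes = len(word.encode("utf-16-le"))
    return utf8_bytes, fixed_bytes


if __name__ == "__main__":
    # One pure-ASCII word and two German words containing umlauts / sharp s.
    for word in ["search", "Universität", "Größe"]:
        utf8_bytes, fixed_bytes = storage_sizes(word)
        print(f"{word}: UTF-8 = {utf8_bytes} bytes, 2-byte storage = {fixed_bytes} bytes")
```

Under that assumption, a word with a few umlauts still takes fewer bytes in UTF-8 than in 2-byte storage, because only the non-ASCII characters cost the extra byte.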
Markus Rietzler
* kommunikation & online service
* RZF NRW
* Tel: 0211.4572-130
