We are happy to announce the most long-awaited release of ASPSeek since
0.9.9. The delay in releasing 1.2.5 had several consecutive causes,
including the desire to finish the man pages, my illness and the British
group Depeche Mode's visit to Russia :)

This release adds some nifty features, as well as bugfixes.

NEW FEATURES:

* UtfStorage parameter has been implemented, see etc/aspseek.conf-dist
  and etc/searchd.conf-dist for details

  This is a new storage mode, or, strictly speaking, an option to the
  UNICODE storage mode. In this mode, called UTFStorage, the contents of
  wordurl[1].word are stored not in plain 2-byte Unicode but in UTF-8.

  From www.utf8.org:
  > UTF-8 encodes each Unicode character as a variable number
  > of 1 to 6 octets, where the number of octets depends on the
  > integer value assigned to the Unicode character. It is an
  > efficient encoding of Unicode documents that use US-ASCII
  > characters because it represents each character in the range
  > U+0000 through U+007F as a single octet.

  So, if most of the indexed words are ASCII, you will get roughly half
  the database size, lower memory consumption by ASPSeek and faster
  indexing and searching.

  If you want to switch to UTF, a converter from "old" Unicode to UTF
  (index -b) can be used.

* Added man pages: aspseek(7), aspseek.conf(5), s.cgi(1), s.htm(5),
  searchd(1), searchd.conf(5) and removed some files from doc/
  subdirectory

  For new users, this will be the most visible change. ASPSeek now has
  a nearly full set of man pages, all written and checked carefully.
  You can see these pages in many different formats (HTML, txt,
  PostScript, PDF) at http://www.aspseek.org/manual.html. The only page
  missing is index(1); I hope it will be ready for the next release. If
  anybody has the time to help finish it, please contact me.

* Significantly reduced memory required for multibyte dictionaries in
  both "searchd" and "index"

  If you use multibyte dictionaries (like the Chinese one provided in
  the tarball), you will notice that 1.2.5 requires much less memory
  than before, because their internal representation has changed.

* Added -R switch to "searchd" for auto-restarting in case of SEGV

  Now you can run searchd -DR and have a nonstop searchd. It should
  rarely crash, but if it does, it will restart automatically.

* Added MaxDocsAtOnce parameter (see etc/aspseek.conf-dist for details)

  The old behaviour of index when indexing multiple sites was to
  retrieve one document from a site, then switch to another site. This
  is not optimal when you have very many different sites to index,
  because the word, href and DNS caches are believed to suffer more in
  this case. So the MaxDocsAtOnce option was added to aspseek.conf; it
  should improve indexing speed if you have many sites.

* Added field "urlword.origin" to crc index in case fast clones are
  enabled

  It was a mistake that this field was not included in the index, so
  the new version should do clone lookups a little faster.

* Added $$ processing to templates code

* Added ASPSEEK_TEMPLATE environment variable processing to "s.cgi"

* Removed OnlineGeo parameter from aspseek.conf (as it does not work)

* Rewrote aspseek-mysql-postinstall to be more verbose

* Changed calls to bzero() to memset() for better portability

* Added langmap file for German language

  Thanks go to Andre Pfeiler <[EMAIL PROTECTED]> for contributing the
  langmap file.

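To make the UtfStorage size argument above concrete, here is a small
Python sketch comparing the bytes needed for a word in fixed 2-byte
Unicode and in UTF-8. It is only an illustration, not ASPSeek code: the
storage_sizes() helper is hypothetical, and UTF-16LE is assumed as a
stand-in for the "plain 2-byte unicode" format.

```python
# Illustrative only: compares the bytes needed to store a word as fixed
# 2-byte Unicode versus UTF-8. This is not ASPSeek code.

def storage_sizes(word):
    """Return (two_byte_size, utf8_size) in bytes for one word."""
    # Assumption: UTF-16LE stands in for the 2-byte Unicode storage;
    # it uses 2 bytes for every character in the Basic Multilingual Plane.
    two_byte = len(word.encode("utf-16-le"))
    # UTF-8 uses 1 byte for ASCII and more for other characters.
    utf8 = len(word.encode("utf-8"))
    return two_byte, utf8


for word in ["search", "index", "поиск", "搜索"]:
    two_byte, utf8 = storage_sizes(word)
    print(f"{word}: 2-byte={two_byte} UTF-8={utf8}")
```

ASCII words take half the space in UTF-8, Cyrillic words take the same
space, and Chinese words take more, so the gain depends on how much of
the indexed vocabulary is ASCII.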
BUGS FIXED:

* Fixed rare memleak in searchd
* Fixed searchd coredump under FreeBSD
* Fixed loading of oracle8 driver
* Fixed searchd coredumps caused by searching some phrases or word lists
* Fixed searchd coredumps caused by very high value of "np" parameter
* Fixed default HTTPS port number
* Minor fix when HTTPS support is not compiled in
* Fixed several bugs related to robots.txt processing
* Added non-absolute URL support and whitespace stripping to redirect
  handling in "index"
* Added whitespace stripping from HREF
* Redirect URI is not lowercased now

The last 6 items in the list above were submitted by Matt Sullivan
<[EMAIL PROTECTED]>. That is quite a bunch of fixes, so I have removed
Matt's name from the THANKS file and added him to AUTHORS, under "Major
Contributors". Thanks for your work, Matt, and I hope to see more
patches from you!

I would also like to thank John Capo <[EMAIL PROTECTED]>, who found an
ASPSeek problem on FreeBSD and posted a solution. The chances of ASPSeek
working on FreeBSD are much higher now that we have fixed some small but
nasty bugs in searchd.

Download sources from www.aspseek.org. Binary packages will be available
in the next few days.

--
[EMAIL PROTECTED]
ICQ 7551596
Phone +7 903 6722750
Reality always seems harsher in the early morning.
