Hi all, and happy new year! In CVS there's (hopefully) a bugfix for a problem with searching for titles/names that contain Strange Chars(tm) (i.e. anything outside ASCII). The problem was noticeable only with the 'http' and 'mobile' data access systems, under Python 2.4.3 or later (2.5 included). Try some funny-name searches and let me know if everything blows up. :-)
Another thing that I noticed only today: in the biographies.list file some Strange Chars(tm) are replaced with their XML numeric character references (e.g. &#263; for ć, an acute-accented c). Some are replaced, and some are not. Quite silly.

My local mirror of the plain text data files is a bit old, so if someone can run a couple of tests against an up-to-date version, it would help to fully understand the situation.

See how many lines are affected in the biographies.list file:

  $ zgrep -c '&#[0-9]\{3,5\};' biographies.list.gz

In my local copy there are about 50 lines with &#...; references, in a file of over 4.2 million lines.

See if other files are affected in the same way:

  $ zgrep -l '&#[0-9]\{3,5\};' *.gz

If only so few lines in a single file are affected, I think it's better to ignore them than to pay the overhead of replacing each reference with the matching Unicode character.

-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
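
For anyone who would rather run the biographies.list check from Python, or wants a feel for what the replacement step would involve, here is a minimal sketch. It is not code from IMDbPY: the names decode_charrefs and count_affected_lines are made up for illustration, it is written for a current Python 3 rather than the 2.4/2.5 interpreters mentioned above, and it assumes the plain text data files are Latin-1 encoded. The regular expression mirrors the zgrep pattern (three to five decimal digits).

```python
import gzip
import re

# Same pattern as the zgrep commands: "&#", three to five digits, ";".
CHARREF_RE = re.compile(r'&#([0-9]{3,5});')


def decode_charrefs(text):
    """Replace decimal references such as &#263; with the matching character."""
    def _replace(match):
        code = int(match.group(1))
        # Leave surrogate code points alone: they are not valid characters
        # on their own and would break later encoding of the string.
        if 0xD800 <= code <= 0xDFFF:
            return match.group(0)
        return chr(code)
    return CHARREF_RE.sub(_replace, text)


def count_affected_lines(path, encoding='latin-1'):
    """Count lines with at least one reference (what zgrep -c reports).

    The default encoding is an assumption about the data files.
    """
    count = 0
    with gzip.open(path, 'rt', encoding=encoding) as handle:
        for line in handle:
            if CHARREF_RE.search(line):
                count += 1
    return count
```

Calling count_affected_lines('biographies.list.gz') should give the same kind of figure as the zgrep -c command, and decode_charrefs() can be applied line by line if the references turn out to be worth replacing after all.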