Sadahiro Tomoyuki wrote:
> On Mon, 29 Mar 2004 23:44:00 +0100
> Rich <[EMAIL PROTECTED]> wrote:
>
>> Using the multi-lingual server scenario I was initially discussing, would
>> one of the following usages be correct (yes, it's just pseudocode and
>> exists in a world where no errors ever occur!):
>
> Though I have not worked with any multitasking application,
> I suppose a possible snag is the size of DUCET (the file named
> allkeys.txt) which should cause slowness of construction of
> a collator and large memory use for storage.

Yes, the size of allkeys.txt is an issue - I did a Data dump of a
Unicode::Collate instance and it's pretty big!

>> 1)
>>
>> my %collators;
>>
>> for ( $server_loop )
>> {
>>     my $lang_tag = Server->requested_lang_tag;
>>
>>     my $collator = $collators{$lang_tag}
>>         ||= Unicode::Collate::Locale->new(locale => $lang_tag);
>>
>>     ...
>> }
>
> 1) creates a new collator if the $lang_tag value is new.
> Say the old one was 'en' (English) and the new one is 'it'
> (Italian): Unicode::Collate::Locale->new will return a default collator
> each time. I.e. $collators{en} and $collators{it} work the same, but
> memory is not shared.

Good point!

> When Unicode::Collate->new is called, all the data generated by parsing
> a table file are stored in the collator, which is a blessed hash.
> The reason, as I thought, is that if (a part of) the newly created data
> were stored elsewhere, say, in a cache in the package namespace
> (e.g. something like %Unicode::Collate::Cache), it might cause some
> problems when users outside the package handle memory in the cache.
>
> I think perhaps it is necessary that a user can determine
> whether two (or more) $lang_tag values create the same collator or not.
>
> my $lang_tag = Server->requested_lang_tag;
> my $canonical = Unicode::Collate::Locale::canonical_name($lang_tag);
>
> # if $canonical is the same as an old one, the collator for it should be
> # the same. After seeing if $canonical is new, a collator can be created.
> # The function name leaves room for reconsideration.

Yes, makes sense, but I'm starting to wonder if Unicode::Collate is too
heavyweight a solution. Perhaps something based around Sort::ArbBiLex
might produce good enough results for most languages.

Thanks for the reply

--
Rich
[EMAIL PROTECTED]
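
For illustration, the caching idea discussed above (key the cache on a canonical name rather than on the raw language tag) could be sketched roughly as follows. This is only a sketch under stated assumptions: canonical_name() here is a hypothetical stand-in written for this example, not an existing Unicode::Collate function, the %TAILORED table is made up, and build_collator() returns a plain hash in place of a real (expensive) collator object.

```perl
use strict;
use warnings;

# ASSUMPTION: a made-up list of locales that have their own tailoring.
# Everything else falls back to the default collator.
my %TAILORED = map { $_ => 1 } qw(es sv da);

# HYPOTHETICAL helper: reduce a language tag to the name of the
# tailoring that would actually be used for it.
sub canonical_name {
    my ($lang_tag) = @_;
    my $tag = lc $lang_tag;
    $tag =~ tr/-/_/;
    # Try the full tag, then strip trailing subtags: es_mx -> es
    while (length $tag) {
        return $tag if $TAILORED{$tag};
        last unless $tag =~ s/_[^_]*\z//;
    }
    return 'default';
}

# Stand-in for the expensive constructor; counts how often it runs.
my $builds = 0;
sub build_collator {
    my ($canonical) = @_;
    $builds++;
    return { locale => $canonical };
}

# Cache keyed on the canonical name, so 'en' and 'it' share one object.
my %collators;
sub collator_for {
    my ($lang_tag) = @_;
    my $canonical = canonical_name($lang_tag);
    return $collators{$canonical} ||= build_collator($canonical);
}
```

With this arrangement, collator_for('en') and collator_for('it') would both resolve to 'default' and return the same object, so the large table data would be held only once per distinct tailoring rather than once per requested tag.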