Till, Bess, Ralph,

>> Assuming the algorithm for enriching the spellings of a word (from PinYin to Chinese) exists, will the result include both forms, PinYin AND non-PinYin Chinese transliteration, and will BOTH forms be indexed?
>> The principle of indexing different forms of spelling for a certain word exists in name authority records, where a name (for ex. Pushkin) has over 30 different spellings. The string of different names can be expanded (Ralph's work with VIAF and with fuzzy logic in WorldCat/identities is definitely relevant here). Maybe something is already underway (?)
>> How large will the resulting index be? Manageable for medium-small installations of VuFind?
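
To make the question about indexing both forms concrete, here is a minimal sketch in Python of index-time expansion, where each token is indexed together with its variant spellings. It is only an illustration under stated assumptions: the `VARIANTS` table, `expand`, and `build_index` are hypothetical names, the three Pushkin variants are a small sample rather than an authority record, and none of this is Solr's or VuFind's actual code.

```python
from collections import defaultdict

# Hypothetical variant table, standing in for a name authority record
# or a transliteration dictionary. Keys are lowercased.
VARIANTS = {
    "pushkin": ["puschkin", "pouchkine", "пушкин"],
}


def expand(tokens, variants=VARIANTS):
    """Yield (position, term) pairs: each original token, followed by
    every known variant at the same position."""
    for pos, tok in enumerate(tokens):
        key = tok.lower()
        yield pos, key
        for alt in variants.get(key, []):
            yield pos, alt


def build_index(docs):
    """Build a toy inverted index (term -> set of document ids)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for _pos, term in expand(text.split()):
            index[term].add(doc_id)
    return index


docs = {1: "Pushkin poems", 2: "Selected letters"}
index = build_index(docs)
```

In this toy case a search for any of the listed forms finds record 1, and the price is extra terms in the index: 7 distinct terms instead of 4. How that scales for a real catalog depends on how many words have variants and how many variants each has, which is the sizing question above.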

Ya'aqov Ziso, Electronic Resource Management Librarian, Rowan University 856 256 4804

On 10/29/09 11:34 AM, "Till Kinstler" <kinst...@gbv.de> wrote:

> Bess Sadler wrote:
> > So, thoughts? Anyone know more about this than I do and want to speak up?
>
> I'd second Demian's and Jonathan's statements: Do that in Solr by using
> a Filter (either at indexing or search time).
> You want to solve that using an algorithm that translates American
> transcription into Chinese, correct? If you have that algorithm (is
> there one?), it's a perfect job for a filter and I guess there are use
> cases outside libraryland as well. It's not only us dealing with
> transcription of Chinese...
> If I misunderstood your approach and you want to use a dictionary to map
> the different transcriptions, solr.SynonymFilterFactory could provide a
> solution.
> (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory)
>
> Till