Hi! Starting from 3.2.0, mnogosearch works in Unicode internally. To add support for a new character set, one needs to write charset->unicode and unicode->charset conversion routines. This is very simple for European languages, which are mostly covered by 8-bit character sets. To add support for a simple charset of this kind, the only thing we need is a charset->unicode mapping table. Such tables are available from ftp.unicode.org.
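
To illustrate the table-driven idea for a simple 8-bit charset, here is a minimal sketch in C. It is not the actual mnogosearch recoding code: the names load_8bit_table, convert_8bit_to_unicode and UNI_UNDEFINED are hypothetical, and it assumes the mapping text follows the unicode.org layout of one "0xBYTE<tab>0xUNICODE<tab># comment" entry per line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: unmapped bytes become U+FFFD REPLACEMENT CHARACTER. */
#define UNI_UNDEFINED 0xFFFDu

/* Hypothetical helper: fill tab[256] from mapping text in unicode.org
 * format. Only plain "one byte -> one code point" lines are loaded;
 * comment lines, byte pairs such as "0xA1+0xE9", and entries whose
 * Unicode side is a sequence ("0x0ACD+0x200C") are skipped.
 * Returns the number of entries loaded. */
static int load_8bit_table(const char *text, unsigned int tab[256])
{
    int loaded = 0;
    unsigned int i;

    assert(text != NULL && tab != NULL);

    for (i = 0; i < 256; i++)
        tab[i] = UNI_UNDEFINED;

    while (*text != '\0') {
        unsigned int byte = 0, uni = 0;
        char sep = 0;
        int used = 0;

        /* sep is the character right after the byte value: a space or
         * tab for a single byte, '+' for a byte pair. */
        if (sscanf(text, "0x%x%c0x%x%n", &byte, &sep, &uni, &used) == 3 &&
            (sep == ' ' || sep == '\t') &&
            text[used] != '+' &&
            byte < 256) {
            tab[byte] = uni;
            loaded++;
        }

        /* Advance to the start of the next line. */
        while (*text != '\0' && *text != '\n')
            text++;
        if (*text == '\n')
            text++;
    }
    return loaded;
}

/* Convert len bytes of an 8-bit charset into Unicode code points.
 * dst must have room for len entries. Returns the number written. */
static size_t convert_8bit_to_unicode(const unsigned char *src, size_t len,
                                      const unsigned int tab[256],
                                      unsigned int *dst)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = tab[src[i]];
    return len;
}
```

The reverse direction (unicode->charset) can be built from the same table, for example by scanning it or by sorting a copy by code point. Multi-byte entries are deliberately left out of this sketch and need separate handling.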

ISCII seems to be a simple (almost!!!) charset in this sense. I found a mapping table for the MacGujarati charset, which is almost the same as ISCII, here:

http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/GUJARATI.TXT

Where can we find an ISCII mapping table?

Please also note this section in the mapping file:

# Section 1: Map the following byte pairs as indicated:
# (ZWNJ means ZERO WIDTH NON-JOINER, ZWJ means ZERO WIDTH JOINER)
# (Also see note about 0xF0 in comments above)
0xA1+0xE9 0x0AD0 # GUJARATI OM
0xAA+0xE9 0x0AE0 # GUJARATI LETTER VOCALIC RR
0xDF+0xE9 0x0AC4 # GUJARATI VOWEL SIGN VOCALIC RR
0xE8+0xE8 0x0ACD+0x200C # GUJARATI SIGN VIRAMA + ZWNJ # explicit halant
0xE8+0xE9 0x0ACD+0x200D # GUJARATI SIGN VIRAMA + ZWJ # soft halant

So some additional coding (~10-15 minutes) is required to take these pairs into account. If you find a conversion map for ISCII, we'll add it to the next release.

TSCII seems to be a very complex charset. I noticed this while reading these two documents:

http://www.xfree86.org/pipermail/i18n/2001-August/002246.html
http://www.geocities.com/Athens/5180/tscii4.html

If you want to contribute TSCII support to the project, feel free to send us patches. I would recommend using the latest CVS sources as a starting point, because the recoding tools have changed since 3.2.3.

Regards!

Vathanana Kumarathurai wrote:
> Hi,
>
> TSCII
> ====
> Here are a few links to TSCII sample texts and the fonts needed to see the texts.
>
> TSCII: Fonts, Keyboard Drivers and Converters
> -- http://www.tamil.net/tscii/tools.html
>
> Sample Web Pages in Tamil based on TSCII format
> -- http://www.geocities.com/Athens/5180/tsctst11.html
> -- http://www.geocities.com/Athens/5180/devsngs.html
> -- http://www.geocities.com/Athens/5180/barati1T.html
>
> A website using TSCII
> -- http://www.aaraamthinai.com/
>
>
>
> ISCII
> ====
> ISCII is more tricky though... :( I haven't been able to find much
> literature in Tamil based on ISCII. I will get back to you about this later.
>
> Indian Institute of Information Technology has developed an ISCII plug-in
> for the browsers to view the texts in the Indian Script. ISCII Plug-In
> -- http://www.iiit.net/ltrc/iscii/index.htm
