Template Version: @(#)sac_nextcase 1.66 04/17/08 SMI
This information is Copyright 2008 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
         Adding Malaysian, Indonesian, Vietnamese UTF-8 locales
    1.2. Name of Document Author/Supplier:
         Author:  Wei Xue
    1.3. Date of This Document:
         07 July, 2008

2. Project Summary
   2.1. Project Description:
        Add support for the following Southeast Asian locales to
        OpenSolaris:
        Malaysian:   ms_MY.UTF-8 
        Indonesian:  id_ID.UTF-8
        Vietnamese:  vi_VN.UTF-8 
        
        Add VISCII and TCVN code conversion to the iconv library, so that
        bidirectional conversion between VISCII, TCVN, and UTF-8/UCS-4/UCS-2
        is supported by iconv(1) and iconv(3).

3. Business Summary
   3.5. Opportunity Window/Exposure:
        Solaris Nevada 
        Project Indiana 

4. Technical Description
   4.1. Details:
        1>
        To create a new locale in Solaris, the following locale data
        categories need to be defined:
        LC_CTYPE
        LC_COLLATE
        LC_NUMERIC
        LC_TIME
        LC_MONETARY
        LC_MESSAGES
         
        CLDR (Unicode's Common Locale Data Repository) [1] is so far the
        largest and most extensive standard repository of locale data, so
        the new UTF-8 locales ms_MY.UTF-8, id_ID.UTF-8 and vi_VN.UTF-8 will
        be created from the standard locale data in CLDR.
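
        As an illustration of how an application would pick up one of the
        new locales, the following is a minimal sketch in C. The helper
        names select_locale() and codeset_is_utf8() are hypothetical and
        not part of this project; setlocale(3C) and nl_langinfo(3C) are
        the standard interfaces. Whether a given locale name is accepted
        depends on the locale packages installed on the system.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/*
 * Select a locale for all categories (LC_CTYPE, LC_COLLATE, ...).
 * Returns 1 on success, 0 if the system rejects the locale name,
 * for example because the locale is not installed.
 * Hypothetical helper, for illustration only.
 */
int select_locale(const char *name)
{
    return setlocale(LC_ALL, name) != NULL;
}

/*
 * Returns 1 if the codeset of the current locale is reported as UTF-8.
 * Hypothetical helper, for illustration only.
 */
int codeset_is_utf8(void)
{
    return strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
}
```

        A caller would typically try select_locale("vi_VN.UTF-8") (or
        "ms_MY.UTF-8", "id_ID.UTF-8") and fall back to the "C" locale if
        the call returns 0.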
  
        The l10n (localization) messages for these three languages are not
        covered by this project. For reference, the current status of l10n
        messages for these three languages is as follows:
        
          * CLI (Command Line Interface) messages:
            Localization of the Solaris system libraries and utilities in
            Malaysian, Indonesian and Vietnamese is not available.
            
          * L10N status of major GUI components, as provided by their
            communities:

                              Malaysian    Indonesian    Vietnamese
            --------------------------------------------------------------
            Gnome[3]          Yes          Yes           Yes
            Firefox [10]      N/A          Yes           N/A
            Thunderbird [11]  N/A          N/A           N/A
            Openoffice [14]   Yes [12]     N/A           Yes [13]

            The GNOME localization content for these three languages has
            been integrated into the SUNWgnome-l10nmessages-extra package
            on Solaris.
         
        2>
        To enhance iconv modules for Vietnamese encodings:

        The most popular encoding standards for Vietnamese are:
        VISCII [8]
        TCVN(5712) [7]
        CP1258 [9]
        
        CP1258 is the Microsoft Windows standard. It is already supported
        by the current Solaris iconv.
        
        If the Vietnamese locale is supported, VISCII and TCVN encoding
        conversion should be supported as well. The following conversions
        are included:
        VISCII <-> UTF-8/UCS-4/UCS-4BE/UCS-4LE/UCS-2/UCS-2BE/UCS-2LE
        TCVN  <-> UTF-8/UCS-4/UCS-4BE/UCS-4LE/UCS-2/UCS-2BE/UCS-2LE
        VISCII <-> TCVN
        (Note: <-> means from and to.)
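
        The conversion pairs listed above can be probed on a given system
        with a short C sketch such as the one below. The helper names
        pair_supported() and count_supported() are hypothetical; the
        codeset names are the ones given in this section, and whether an
        iconv implementation accepts them is system dependent.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Returns 1 if iconv_open() offers a converter from `fromcode` to
 * `tocode` on this system, 0 otherwise. Hypothetical helper.
 */
int pair_supported(const char *tocode, const char *fromcode)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}

/*
 * Counts how many of the 14 directions between one Vietnamese codeset
 * (for example "VISCII" or "TCVN") and the Unicode forms named in this
 * section are available. Hypothetical helper.
 */
int count_supported(const char *viet)
{
    static const char *const unicode_forms[] = {
        "UTF-8", "UCS-4", "UCS-4BE", "UCS-4LE",
        "UCS-2", "UCS-2BE", "UCS-2LE"
    };
    const size_t n = sizeof unicode_forms / sizeof unicode_forms[0];
    int ok = 0;

    for (size_t i = 0; i < n; i++) {
        ok += pair_supported(unicode_forms[i], viet);  /* viet -> Unicode */
        ok += pair_supported(viet, unicode_forms[i]);  /* Unicode -> viet */
    }
    return ok;
}
```

        A result of 14 from count_supported("VISCII") would indicate that
        every direction in the first line of the list is available; the
        VISCII <-> TCVN pair can be checked with two direct calls to
        pair_supported().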
  
        Since the characters of the Malaysian and Indonesian languages are
        covered by the ISO8859-1 [2] standard, no extra iconv modules are
        needed for them.
        
        
   4.5. Interfaces:
        1>
        For the function:
        iconv_t iconv_open(const char *tocode, const char *fromcode)
        the tocode and fromcode parameters will accept: VISCII, TCVN (TCVN5712).
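
        A minimal sketch of how a caller might use these codeset names
        through iconv(3) follows. The helper convert_buffer() is
        hypothetical and only wraps the standard iconv_open(), iconv() and
        iconv_close() calls. It assumes the whole input fits in one output
        buffer, and it reports failure when the requested converter is not
        available on the system.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Convert `inlen` bytes of `in` from `fromcode` to `tocode`.
 * Returns the number of bytes written to `out`, or (size_t)-1 on any
 * failure: unknown codeset pair, invalid input, or output buffer too
 * small. Hypothetical helper, for illustration only.
 */
size_t convert_buffer(const char *tocode, const char *fromcode,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    /*
     * The type of iconv()'s input pointer differs between platforms
     * (char ** on some, const char ** on others); this cast targets the
     * char ** form and may need adjusting.
     */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1) {
        /* Flush any pending state held by the converter. */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    }
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    return outcap - outleft;
}
```

        For example, convert_buffer("UTF-8", "VISCII", src, srclen, dst,
        sizeof dst) would convert a VISCII buffer to UTF-8 once the
        converter described in this case is installed.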

        2> 
        For the utility:
        iconv [-cs] -f frommap -t tomap [file]...
        "frommap" and "tomap" will accept VISCII, TCVN (TCVN5712).
 

   4.6. Doc Impact:
        None.

   4.7. Admin/Config Impact:
        None.

   4.8. HA Impact:
        None.

   4.9. I18N/L10N Impact:
        No impact to current XI18N.

   4.10. Packaging & Delivery:
        locale enabling packages (new packages):
        SUNWlang-ms
        SUNWlang-id
        SUNWlang-vi

        iconv packages (updated packages):
        SUNWiconv-extra
        SUNWiconv-unicode

   4.11. Security Impact:
        None.

   4.12. Dependencies:       


5. Reference Documents:
   [1] Unicode CLDR Project: Common Locale Data Repository
      http://unicode.org/cldr
   
   [2] ISO/IEC 8859-1:1998:
      http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n411.pdf

   [3] Gnome l10n message language list:
      http://l10n.gnome.org/languages/

   [4] Vietnamese Unicode FAQ:
      http://vietunicode.sourceforge.net/

   [5] Vietnamese encoding conversion-tables: 
      http://www.haible.de/bruno/charsets/conversion-tables/Vietnamese.html

   [6] The TCVN 6909 standard:
      http://www.informatik.uni-leipzig.de/~duc/software/misc/tcvn6909.pdf

   [7] TCVN 5712:1993 standard:
      http://www.informatik.uni-leipzig.de/~duc/software/misc/tcvn.txt

   [8] RFC 1456 - Conventions for Encoding the Vietnamese Language
      http://tools.ietf.org/html/rfc1456

   [9] Windows 1258 reference:
      http://www.microsoft.com/globaldev/reference/sbcs/1258.mspx
      
   [10] Firefox is available in over 45 languages:   
      http://www.mozilla.com/en-US/firefox/all.html

   [11] Available Thunderbird languages: 
      http://www.mozilla.com/en-US/thunderbird/all.html

   [12] Malaysian openoffice website:   
      http://ms.openoffice.org/
      
   [13] Vietnamese openoffice website:      
      http://vi.openoffice.org/

   [14] Language localization status of openoffice
      http://wiki.services.openoffice.org/wiki/Languages

6. Resources and Schedule
    6.4. Steering Committee requested information
        6.4.1. Consolidation C-team Name:
                Globalization
    6.5. ARC review type: FastTrack
    6.6. ARC Exposure: open

