Mano wrote:
>Thanks Jim.... Progress !
>
>But, I am not sure that this is the way to go.
>The first byte is always 206
>the second byte is between 145 to 169 for uppercase greek characters.
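Those byte values are what I would expect from UTF-8. As a quick check outside of GT.M, here is a small Python sketch that encodes the 24 basic Greek capitals and prints the range of each byte. Python is only my choice for illustration; nothing in this thread depends on it, and the variable names are made up.

```python
# Basic Greek capitals U+0391 (Alpha) through U+03A9 (Omega).
# U+03A2 is skipped because it is not an assigned code point.
capitals = [chr(cp) for cp in range(0x0391, 0x03AA) if cp != 0x03A2]

# Each letter as the raw bytes a byte-oriented system would receive.
encoded = [ch.encode("utf-8") for ch in capitals]

lead_bytes = sorted({b[0] for b in encoded})
second_bytes = sorted(b[1] for b in encoded)

print("bytes per letter:", sorted({len(b) for b in encoded}))
print("lead byte values:", lead_bytes)
print("second byte range:", second_bytes[0], "to", second_bytes[-1])

# The same letters in ISO-8859-7 take one byte each.
iso = [ch.encode("iso8859_7") for ch in capitals]
print("ISO-8859-7 bytes per letter:", sorted({len(b) for b in iso}))
print("ISO-8859-7 range:", iso[0][0], "to", iso[-1][0])
```

The lead byte comes out as 206 for every capital and the second byte runs from 145 to 169, while the ISO-8859-7 versions are single bytes.
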
How are the UTF-8 Unicode characters getting to your GT.M input? Are you typing them directly into a terminal emulator, pasting them in from another window, or something else? Can you just as easily set up your workstation for ISO-8859-7?

I was thinking that the ISO-8859-7 encoding might work better for you because it is a one-byte-per-character encoding. With pattern codes and collation set up for Greek characters, you might be able to get along without having to make extensive changes to input transforms and such.

I expect that UTF-8 or UTF-16 might work better for you overall if you need to work with more than just the two languages, Greek and English, but it will undoubtedly require more extensive modifications to different parts of VistA. For instance, if each Greek character is represented with two bytes, the length limit on many data fields will need to be doubled.

I don't know Greek beyond being able to recognize many of the letters of the alphabet, and I haven't had occasion to work much outside of English, so I don't really know what all the complications might be. It might be helpful to get some input here from our Finnish or German friends.

>I went through and changed my pattern codes.
>It works, no errors... but...
>I didn't get it to work exactly yet... but I am closer.
>
>I am not exactly sure if this is the right road though.
>I notice that GTM acts funny when I am using the Greek characters
>(unicode). For instance deleting and changing characters via the line
>editor will often not work properly.. The cursor often gets messed up.

Yes, UTF-8 as input to a dumb terminal would seem to be problematic. GT.M is set up for one byte per character, and each Greek letter encoded with UTF-8 comes across as 2 bytes. Characters from some languages would be encoded with 3 or 4 bytes. This variability of bytes per character would mess up cursor positioning, character deletion, etc. in a line editor unless the editor was specifically designed to handle it.
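
To make the byte counting concrete, here is a rough Python sketch of what a byte-oriented system sees when it is handed UTF-8 Greek. This is only an illustration: the sample word and the field limit of 8 are made up, and it is not how GT.M itself is implemented.

```python
word = "ΕΛΛΑΣ"                    # sample value, 5 Greek capitals
data = word.encode("utf-8")       # what a byte-per-character system stores

print(len(word), "characters,", len(data), "bytes")

# A hypothetical field limit of 8, applied to bytes, keeps only 4 letters.
FIELD_LIMIT = 8
stored = data[:FIELD_LIMIT]
print(stored.decode("utf-8"))

# A byte-oriented "delete last character" removes only half of a letter,
# leaving a dangling lead byte that no longer decodes as UTF-8.
after_delete = data[:-1]
try:
    after_delete.decode("utf-8")
    intact = True
except UnicodeDecodeError:
    intact = False
print("still valid UTF-8 after one-byte delete:", intact)
```

A line editor that counts bytes will likewise be off by one column for every Greek letter on the line, which matches the cursor trouble described above.
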
This is much less of a problem for web-based applications because the editing of input values is handled in the web browser and the input values are presented to the server as whole units.

>I know mumps doesn't care if a character is 8 bits or 800 bits.... but
>apparently, parts of mumps do care,

Somebody stated that the MUMPS standard doesn't care. Individual MUMPS implementations do care. Some support 16 bits per character intrinsically, but most support only 8 bits.

Because of the way that UTF-8 was designed, I think that GT.M and VistA could support UTF-8 pretty well as data, especially for languages such as Greek that only require 2-byte encodings. UTF-8 was designed to get along as well as possible with existing systems oriented around 8 bits per character. It avoids control characters in the bytes of multi-byte characters, character strings collate properly on a byte-by-byte basis, and for a 2-byte character, the lead byte takes only a few values within a given language, so the continuation byte does most of the work of telling characters apart.

>is there any way to set character length?
>is this something that needs to be done to GTM?

This is not currently an option. Adding support for Unicode to GT.M is one of the proposals in the GT.M new features survey that Bhaskar recently announced here (http://www.sanchez-gtm.com/survey/gtm_survey_05.asp).

>Or... as usual.... am I just missing the boat.
>
>Manolis
>
>On Fri, 2005-04-08 at 23:58 -0700, Jim Self wrote:
>> I copied your pattern file to "test.pattern" without the comment at the
>> bottom and added a comment at the top to remind me what it is next time
>> I see it. Then I loaded it with no problem and tested it:
>>
>> view "PATLOAD":"test.pattern","PATCODE":"GREEK"
>> f i=1:1:255 s c=$c(i) w !,i,?5,c?1a,c?1u,c?1l,c?1n,c?1p,c?1c,?15,c
>>
>> That gave me an error "Current pattern table has no characters with
>> pattern code P", which suggests that you need to define PATCODE C as well.
>>
>> I put in some entries for codes C and P and the test ran to completion.
>>
>> Set up your environment variables after debugging your pattern file and
>> use the view commands until then.
>>
>> Mano wrote:
>> >I decided to try the pattable stuff that I was told about
>> >I read from pages 40 to 43 in the GTM unix programmers guide and tried
>> >to set up my own pattern table
>> >
>> >I set up my environment variables
>> >gtm_pattern_file=/blah/blah/pattern (which is the pattern file)
>> >gtm_pattern_table=GREEK
>> >
>> >then I have the following as my pattern file
>> >---------------------------------------------------------
>> >PATSTART
>> >  PATTABLE GREEK
>> >    PATCODE L
>> >      97,98,99, -
>> >      100,101,102,103,104,105,106,107,108,109, -
>> >      110,111,112,113,114,115,116,117,118,119, -
>> >      120,121,122, -
>> >      219, -
>> >      220,221,222,223,224,225,226,227,228,229, -
>> >      230,231,232,233,234,235,236,237,238,239, -
>> >      240,241,242,243,244,245,246,247,248,249, -
>> >      250,251,252,253,254
>> >    PATCODE N
>> >      48,49,50,51,52,53,54,55,56,57
>> >    PATCODE U
>> >      65,66,67,68,69, -
>> >      70,71,72,73,74,75,76,77,78,79, -
>> >      80,81,82,83,84,85,86,87,88,89,90, -
>> >      181,182,184,185,186,188,191, -
>> >      193,194,195,196,197,198,199, -
>> >      200,201,202,203,204,205,206,207,208,209, -
>> >      210,211,212,213,214,215,216,217,218
>> >
>> >PATEND
>> >;it says that there is a syntax error on PATEND
>> >---------------------------------------------------------
>> >I run vista
>> >I enter VIEW "PATLOAD":"pattern" and get the following error
>> >
>> >GTM>VIEW "PATLOAD":"pattern"Cannot load table GREEK twice
>> >
>> >%GTM-E-PATTABSYNTAX, Error in pattern at line 24
>> >
>> >GTM>VIEW "PATCODE":"GREEK"
>> >
>> >GTM>
>> >----------------------------------------------------
>> >
>> >
>> >I am sure it is something simple but my eyes have gone all blurry.
>> >
>> >Thanks!
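
One way to cross-check a hand-typed table like the one quoted above, assuming it is meant for ISO-8859-7, is to let a Unicode-aware tool classify the upper half of the code page. Here is a minimal Python sketch of that idea. The function name is made up, Python is just a convenient outside tool, and the output is plain lists for comparison rather than anything in GT.M pattern file syntax.

```python
import unicodedata

def classify_iso8859_7():
    """Return (upper, lower) lists of byte values 128-255 that decode to
    uppercase or lowercase letters in ISO-8859-7."""
    upper, lower = [], []
    for value in range(128, 256):
        # Undefined positions decode to U+FFFD, which is not a letter.
        ch = bytes([value]).decode("iso8859_7", errors="replace")
        category = unicodedata.category(ch)
        if category == "Lu":
            upper.append(value)
        elif category == "Ll":
            lower.append(value)
    return upper, lower

upper, lower = classify_iso8859_7()
print("uppercase:", upper)
print("lowercase:", lower)
```

If I have the code page right, this puts 219 among the uppercase letters and 192 among the lowercase ones, so it may be worth comparing the two lists against the PATCODE U and PATCODE L entries.
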
---------------------------------------
Jim Self
Systems Architect, Lead Developer
VMTH Computer Services, UC Davis
(http://www.vmth.ucdavis.edu/us/jaself)

_______________________________________________
Hardhats-members mailing list
https://lists.sourceforge.net/lists/listinfo/hardhats-members
