Mano wrote:
>Thanks Jim.... Progress !
>
>But, I am not sure that this is the way to go.
>The first byte is always 206
>the second byte is between 145 to 169 for uppercase greek characters.

How are the UTF-8 Unicode characters getting to your GT.M input? Are you
typing them directly into a terminal emulator, pasting them in from another
window, or what? Can you as easily set up your workstation for ISO-8859-7?

I was thinking that the ISO-8859-7 encoding might work better for you because
it is a one-byte-per-character encoding. With pattern codes and collation set
up for Greek characters, you might be able to get along without having to make
extensive changes to input transforms and such.
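
To make the one-byte-per-character point concrete, here is a minimal sketch in
Python (not M, only because it is easy to run outside GT.M) that lists which
byte values ISO-8859-7 assigns to upper and lower case letters. It relies on
Python's built-in iso8859_7 codec and its Unicode case properties, which is my
assumption about what you would want under PATCODE U and L. The helper name
greek_case_codes is made up for illustration.

```python
def greek_case_codes():
    """Return (upper, lower) lists of ISO-8859-7 byte values in 128-255."""
    upper, lower = [], []
    for code in range(128, 256):
        try:
            ch = bytes([code]).decode("iso8859_7")
        except UnicodeDecodeError:
            continue  # this byte value is not assigned in the code page
        if ch.isupper():
            upper.append(code)
        elif ch.islower():
            lower.append(code)
    return upper, lower


upper, lower = greek_case_codes()
print("upper:", upper)
print("lower:", lower)

# Every letter of an unaccented Greek word is exactly one byte here.
print(len("ΑΘΗΝΑ".encode("iso8859_7")))
```

The lists are only a starting point to compare against a hand-written pattern
file; they say nothing about how GT.M itself classifies those bytes.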

I expect that UTF-8 or UTF-16 might work better for you overall if you need to
work with more than just the two languages, Greek and English, but it will
undoubtedly require more extensive modifications to different parts of VistA.
For instance, if each Greek character is represented with two bytes, the
length limit on many data fields will need to be doubled.
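
The field-length concern can be sketched like this, again in Python rather
than M. The 20-byte limit and the function names are hypothetical, chosen only
to show a value that fits in one encoding and not in the other.

```python
def byte_length(value, encoding):
    """Number of bytes the value occupies in the given encoding."""
    return len(value.encode(encoding))


def fits_field(value, max_bytes, encoding="utf-8"):
    """True if the encoded value fits a field limited to max_bytes bytes."""
    return byte_length(value, encoding) <= max_bytes


name = "ΠΑΠΑΔΟΠΟΥΛΟΣ"  # 12 Greek characters

print(byte_length(name, "iso8859_7"))   # one byte per character
print(byte_length(name, "utf-8"))       # two bytes per Greek character
print(fits_field(name, 20, "iso8859_7"))
print(fits_field(name, 20, "utf-8"))
```

Doubling is the worst case for Greek text; ASCII characters mixed into the
same value still take one byte each in UTF-8.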

I don't know Greek beyond being able to recognize many of the letters of the
alphabet, and I haven't had occasion to work much outside of English, so I
don't really know what all the complications might be. It might be helpful to
get some input here from our Finnish or German friends.

>I went through and changed my pattern codes.
>It works, no errors... but...
>I didn't get it to work exactly yet... but I am closer.
>
>I am not exactly sure if this is the right road though.
>I notice that GTM acts funny when I am using the Greek characters
>(unicode).  For instance deleting and changing characters via the line
>editor will often not work properly.. The cursor often gets messed up.

Yes, UTF-8 as input to a dumb terminal would seem to be problematic. GT.M is
set up for one byte per character and each Greek letter encoded with UTF-8
comes across as 2 bytes. Characters from some languages would be encoded with
3 or even 4 bytes. This variability of bytes per character would mess up
cursor positioning, character deletion, etc. in a line editor unless the
editor was specifically designed to handle it.
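
Here is a rough illustration of why deletion goes wrong, written in Python
rather than M. It compares a byte-oriented backspace with one that knows about
UTF-8 continuation bytes (those of the form 10xxxxxx). Both function names are
invented for this sketch and have nothing to do with how GT.M's line editor is
actually implemented.

```python
def delete_last_byte(buf):
    """What a one-byte-per-character editor does on backspace."""
    return buf[:-1]


def delete_last_char(buf):
    """Backspace that removes one whole UTF-8 encoded character."""
    i = len(buf)
    # Step back over continuation bytes (bit pattern 10xxxxxx).
    while i > 0 and (buf[i - 1] & 0xC0) == 0x80:
        i -= 1
    # Then drop the lead byte (or the single ASCII byte) as well.
    return buf[:max(i - 1, 0)]


line = "αβ".encode("utf-8")        # 4 bytes: CE B1 CE B2

broken = delete_last_byte(line)    # leaves a dangling lead byte
fixed = delete_last_char(line)     # leaves a complete character

print(list(broken))
print(fixed.decode("utf-8"))
```

The byte-oriented version leaves half a character in the buffer, which is
consistent with the cursor and display getting out of step.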

This is much less of a problem for web based applications because the editing
of input values is handled in the web browser and the input values are
presented to the server as whole units.

>I know mumps doesn't  care if a character is 8 bits or 800 bits.... but
>apparently, parts of mumps do care,

Somebody stated that the MUMPS standard doesn't care. Individual MUMPS
implementations do care. Some support 16 bits per character intrinsically,
but most support only 8 bits.

Because of the way that UTF-8 was designed, I think that GT.M and VistA could
support UTF-8 pretty well as data, especially for languages such as Greek that
only require 2-byte encodings. UTF-8 was designed to get along as well as
possible with existing systems oriented around 8 bits per character. It avoids
control characters in the bytes of multi-byte characters, character strings
collate properly on a byte-by-byte basis, and for a 2-byte character, the
continuation byte is unique within a given language.
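
Those properties are easy to check for the unaccented uppercase Greek letters
with a short Python sketch (Python only for convenience). It also reproduces
the byte values you reported. Note that "collates properly" here means byte
order matches Unicode code point order, which I am assuming is close enough to
alphabetical order for unaccented Greek; the sample words are arbitrary.

```python
# Uppercase Α..Ω are U+0391..U+03A9; U+03A2 is unassigned.
upper = [chr(cp) for cp in range(0x0391, 0x03AA) if cp != 0x03A2]

lead_bytes = {ch.encode("utf-8")[0] for ch in upper}
second_bytes = [ch.encode("utf-8")[1] for ch in upper]
print(lead_bytes, min(second_bytes), max(second_bytes))

# No byte of a multi-byte character falls in the ASCII/control range.
no_controls = all(b >= 0x80 for ch in upper for b in ch.encode("utf-8"))
print(no_controls)

# Sorting by raw bytes gives the same order as sorting by code point.
words = ["ΩΜΕΓΑ", "ΑΛΦΑ", "ΒΗΤΑ", "alpha", "ΓΑΜΑ"]
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
by_code_points = sorted(words)
print(by_bytes == by_code_points)
```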

>is there anyway to set character length?
>is this something that needs to be done to GTM?

This is not currently an option. Adding support for Unicode to GT.M is one of
the proposals in the GT.M new features survey that Bhaskar recently announced
here. (http://www.sanchez-gtm.com/survey/gtm_survey_05.asp)

>Or... as usual.... am I just missing the boat.
>
>Manolis
>
>On Fri, 2005-04-08 at 23:58 -0700, Jim Self wrote:
>> I copied your pattern file to "test.pattern" without the comment at the
>> bottom and added a comment at the top to remind me what it is next time I
>> see it. Then I loaded it with no problem and tested it:
>>
>>  view "PATLOAD":"test.pattern","PATCODE":"GREEK"
>>  f i=1:1:255 s c=$c(i) w !,i,?5,c?1a,c?1u,c?1l,c?1n,c?1p,c?1c,?15,c
>>
>> That gave me an error "Current pattern table has no characters with pattern
>> code P", which suggests that you need to define PATCODE C as well.
>>
>> I put in some entries for codes C and P and the test ran to completion.
>>
>> Set up your environment variables after debugging your pattern file and use
>> the view commands until then.
>>
>> Mano wrote:
>> >I decided to try the pattable stuff that I was told about
>> >I read from pages 40 to 43 in the GTM unix programmers guide and tried
>> >to set up my own pattern table
>> >
>> >I set up my environment variables
>> >gtm_pattern_file=/blah/blah/pattern (which is the pattern file)
>> >gtm_pattern_table=GREEK
>> >
>> >then I have the following as my pattern file
>> >---------------------------------------------------------
>> >PATSTART
>> >        PATTABLE GREEK
>> >         PATCODE L
>> >           97,98,99, -
>> >           100,101,102,103,104,105,106,107,108,109, -
>> >           110,111,112,113,114,115,116,117,118,119, -
>> >           120,121,122, -
>> >           219, -
>> >           220,221,222,223,224,225,226,227,228,229, -
>> >           230,231,232,233,234,235,236,237,238,239, -
>> >           240,241,242,243,244,245,246,247,248,249, -
>> >           250,251,252,253,254
>> >         PATCODE N
>> >           48,49,50,51,52,53,54,55,56,57
>> >         PATCODE U
>> >           65,66,67,68,69, -
>> >           70,71,72,73,74,75,76,77,78,79, -
>> >           80,81,82,83,84,85,86,87,88,89,90, -
>> >           181,182,184,185,186,188,191, -
>> >           193,194,195,196,197,198,199, -
>> >           200,201,202,203,204,205,206,207,208,209, -
>> >           210,211,212,213,214,215,216,217,218
>> >
>> >PATEND
>> >;it says that there is a syntax error on PATEND
>> >---------------------------------------------------------
>> >I run vista
>> >I enter VIEW "PATLOAD":"pattern" and get the following error
>> >
>> >GTM>VIEW "PATLOAD":"pattern"Cannot load table GREEK twice
>> >
>> >%GTM-E-PATTABSYNTAX, Error in pattern at line 24
>> >
>> >GTM>VIEW "PATCODE":"GREEK"
>> >
>> >GTM>
>> >----------------------------------------------------
>> >
>> >
>> >I am sure it is something simple but my eyes have gone all blurry.
>> >
>> >Thanks!

---------------------------------------
Jim Self
Systems Architect, Lead Developer
VMTH Computer Services, UC Davis
(http://www.vmth.ucdavis.edu/us/jaself)


_______________________________________________
Hardhats-members mailing list
https://lists.sourceforge.net/lists/listinfo/hardhats-members
