Which characters does it consider invalid? Just <32 and >128?

I use the following. (It's missing a few accents; I need to update it when I
find some time.)
I think I'll add Will's piece to the end, though, to see if I missed any.

SUBROUTINE REMOVE.IIT.ACCENT(INDATA,OUTDATA)
 *
    UAE=CHAR(140):CHAR(198)
    LAE=CHAR(156):CHAR(230)
    DEGREE=CHAR(176)
 *
    
    NULL1=CHAR(128):CHAR(129):CHAR(130):CHAR(131):CHAR(132):CHAR(133):CHAR(134):CHAR(135):CHAR(136):CHAR(137)
    NULL2=CHAR(139):CHAR(141):CHAR(143):CHAR(144):CHAR(145):CHAR(146):CHAR(147):CHAR(148):CHAR(149):CHAR(150)
    NULL3=CHAR(151):CHAR(152):CHAR(153):CHAR(155):CHAR(157):CHAR(160):CHAR(161):CHAR(162):CHAR(163):CHAR(164)
    NULL4=CHAR(165):CHAR(166):CHAR(167):CHAR(168):CHAR(169):CHAR(171):CHAR(172):CHAR(173):CHAR(174):CHAR(175)
    NULL5=CHAR(177):CHAR(178):CHAR(179):CHAR(180):CHAR(181):CHAR(182):CHAR(183):CHAR(184):CHAR(185):CHAR(187)
    NULL6=CHAR(188):CHAR(189):CHAR(190):CHAR(191)
    NULLX=CHAR(186):CHAR(216):CHAR(222):CHAR(240):CHAR(247):CHAR(248)
 *
    UPA=CHAR(192):CHAR(193):CHAR(194):CHAR(195):CHAR(196):CHAR(197)
    LWA=CHAR(224):CHAR(225):CHAR(226):CHAR(227):CHAR(228):CHAR(229)
    RUA=STR('A',LEN(UPA))
    RLA=STR('a',LEN(LWA))
    UPE=CHAR(200):CHAR(201):CHAR(202):CHAR(203)
    LWE=CHAR(232):CHAR(233):CHAR(234):CHAR(235)
    RUE=STR('E',LEN(UPE))
    RLE=STR('e',LEN(LWE))
    UPI=CHAR(204):CHAR(205):CHAR(206):CHAR(207)
    LWI=CHAR(236):CHAR(237):CHAR(238):CHAR(239)
    RUI=STR('I',LEN(UPI))
    RLI=STR('i',LEN(LWI))
    UPO=CHAR(210):CHAR(211):CHAR(212):CHAR(213):CHAR(214)
    LWO=CHAR(242):CHAR(243):CHAR(244):CHAR(245):CHAR(246)
    RUO=STR('O',LEN(UPO))
    RLO=STR('o',LEN(LWO))
    UPU=CHAR(217):CHAR(218):CHAR(219):CHAR(220)
    LWU=CHAR(249):CHAR(250):CHAR(251):CHAR(252)
    RUU=STR('U',LEN(UPU))
    RLU=STR('u',LEN(LWU))
    
    SS1=CHAR(138):CHAR(142):CHAR(154):CHAR(158):CHAR(159):CHAR(199):CHAR(208):CHAR(209):CHAR(221):CHAR(223):CHAR(231)
    SR1="SZszYCDNYBc"
    SS2=CHAR(241):CHAR(253):CHAR(170):CHAR(215)
    SR2="nyax"
*
    EXPR1=UPA:LWA:UPE:LWE:UPI:LWI:UPO:LWO:UPU:LWU:SS1:SS2
    EXPR2=RUA:RLA:RUE:RLE:RUI:RLI:RUO:RLO:RUU:RLU:SR1:SR2
    NULCHARS=NULL1:NULL2:NULL3:NULL4:NULL5:NULL6:NULLX:DEGREE
*
    LIN=INDATA
*
    CONVERT EXPR1 TO EXPR2 IN LIN
    CONVERT NULCHARS TO "" IN LIN
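    * CHANGE matches its whole search string, so each ligature is replaced on its own
    LIN=CHANGE(LIN,UAE[1,1],"OE")
    LIN=CHANGE(LIN,UAE[2,1],"AE")
    LIN=CHANGE(LIN,LAE[1,1],"oe")
    LIN=CHANGE(LIN,LAE[2,1],"ae")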
*
    OUTDATA=LIN
    RETURN
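
Bolting Will's check (quoted below) onto the end might look roughly like
this. It is an untested sketch: RAW, CLEAN, MASKED and POS are just
illustrative names, and the sample string is made up. It compares the MCP
output with the cleaned string one character at a time instead of using
INDEX, so it does not depend on which mask character MCP emits on a given
platform.

```basic
* Hypothetical harness, not part of the subroutine above.
* RAW is a made-up sample: "Cafe" with an accented e (CHAR(233)).
RAW = "Caf":CHAR(233)
CALL REMOVE.IIT.ACCENT(RAW, CLEAN)
* MCP masks every character it treats as non-printable.
MASKED = OCONV(CLEAN, "MCP")
IF CLEAN # MASKED THEN
   * Something survived the clean-up; report each position that was masked.
   FOR POS = 1 TO LEN(CLEAN)
      IF CLEAN[POS,1] # MASKED[POS,1] THEN
         CRT "Unhandled CHAR(":SEQ(CLEAN[POS,1]):") at position ":POS
      END
   NEXT POS
END
```

Any position it reports is a character the CONVERT tables still need to
cover.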

> -----Original Message-----
> From: [email protected] [mailto:u2-users-
> [email protected]] On Behalf Of [email protected]
> Sent: Wednesday, January 12, 2011 12:28 PM
> To: [email protected]
> Subject: Re: [U2] Special Character Handling
> 
> 
> 
> I would suggest this
> 
> IF String # Oconv(String,"MCP") then
> 
> End
> 
> This will quite quickly tell you *whether* any given string has a
> non-printable char in it.
> It is the fastest known method to give you this Boolean result.
> 
> Then use the INDEX function to return the absolute location of any "."
> (period) in the string.  If there was not a period at that same
> location in the original string, then that char is invalid.
> 
> W
> _______________________________________________
> U2-Users mailing list
> [email protected]
> http://listserver.u2ug.org/mailman/listinfo/u2-users