On Mar 6, 2010, at 10:10, Richard Troth wrote:
>
>
> The quickest trigger for me, w/r/t which EBCDIC code page was used, is
> where the square brackets land.  If I see AD and BD, then it's
> probably 1047.  If I see BA and BB, then it's a good guess that it's
> CP 37.  But most of my world involves C source or other things rich in
> square brackets.  YMMV.  Scanning for "not" is another helpful hint.
>
That sounds like a good technique for C source code.  And one
can confirm one's guess of "C" by looking for strings such as
"#include", "int", and "/*".  What Rexx characters are code-page
sensitive?
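
The bracket test quoted above can be sketched in a few lines of Python.
This is only an illustration, not anything from CMS or Pipelines: the
function name guess_ebcdic_code_page and the labels "IBM-1047" and
"IBM-037" are made up for the example, and the byte values are the ones
Richard gives (AD/BD for 1047, BA/BB for CP 37).  It counts bracket
bytes and declines to guess when the counts tie or no brackets appear.

```python
from collections import Counter

# Positions of '[' and ']' per the quoted message (assumed labels).
BRACKET_BYTES = {
    "IBM-1047": (0xAD, 0xBD),
    "IBM-037": (0xBA, 0xBB),
}


def guess_ebcdic_code_page(data: bytes):
    """Guess the code page from where the square brackets land.

    Returns one of the labels above, or None when the evidence is
    absent or evenly split.  A heuristic, not a proof.
    """
    counts = Counter(data)  # maps byte value (int) -> occurrences
    score_1047 = sum(counts[b] for b in BRACKET_BYTES["IBM-1047"])
    score_037 = sum(counts[b] for b in BRACKET_BYTES["IBM-037"])
    if score_1047 == score_037:
        return None  # includes the case where neither pair occurs
    return "IBM-1047" if score_1047 > score_037 else "IBM-037"


# "int a[10];" with the brackets at AD/BD, then at BA/BB.
SAMPLE_1047 = bytes([0x89, 0x95, 0xA3, 0x40, 0x81, 0xAD, 0xF1, 0xF0, 0xBD, 0x5E])
SAMPLE_037 = bytes([0x89, 0x95, 0xA3, 0x40, 0x81, 0xBA, 0xF1, 0xF0, 0xBB, 0x5E])
```

Text that is not rich in brackets (or binary data that happens to
contain those byte values) will defeat this, which is where a second
hint such as the position of the "not" sign would have to come in.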

Similarly, one might recognize Rexx by the "/* Rexx */" initial
comment, "DO", "END", etc.

Are you in a place where the CMS filetype is no help, such as
a pipelines data stream?

Other languages?  Do you want to ignore comments, which skew
the statistics?

UTF-8 rules!  Is there a PIPE XLATE FROM 1047 TO UTF-8?  (That
conversion is available in z/OS iconv().)  But is your mail agent
RFC 1652 savvy?

-- gil