On Mar 6, 2010, at 10:10, Richard Troth wrote:

> The quickest trigger for me, w/r/t which EBCDIC code page was used, is
> where the square brackets land.  If I see AD and BD, then it's
> probably 1047.  If I see BA and BB, then it's a good guess that it's
> CP 37.  But most of my world involves C source or other things rich in
> square brackets.  YMMV.  Scanning for "not" is another helpful hint.

That sounds like a good technique for C source code.  And one can
confirm one's guess of "C" by looking for strings such as "#include",
"int", and "/*".  What Rexx characters are code-page sensitive?
Similarly, one might recognize Rexx by the "/* Rexx */" initial
comment, "DO", "END", etc.  Are you in a place where the CMS filetype
is no help, such as a pipelines data stream?  Other languages?  Do you
want to ignore comments, which skew the statistics?

UTF-8 rules!  Is there PIPE XLATE FROM 1047 to UTF-8?  (Available in
z/OS iconv().)  But is your mail agent RFC 1652 savvy?

-- gil
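
For illustration only, here is a minimal Python sketch of the two
heuristics discussed above: counting where the square brackets land
(AD/BD for 1047, BA/BB for CP 37), and then looking for language
markers such as "#include" or "/* Rexx */".  The function names, the
marker lists, and the "unknown" tie result are assumptions of mine,
not anything from the thread.  It does not try to ignore comments, and
it leans on Python's cp037 codec only to build the marker byte strings,
which is safe here because the letters, "#", "/", "*", and blank sit at
the same code points in both pages.

```python
from collections import Counter

# Code points quoted in the thread: where "[" and "]" land in each page.
BRACKETS_1047 = (0xAD, 0xBD)
BRACKETS_037 = (0xBA, 0xBB)

# Hypothetical marker lists, taken from the strings mentioned above.
# Note that "/*" also opens a Rexx comment, so it is only a weak hint.
MARKERS = {
    "C": ["#include", "int ", "/*"],
    "Rexx": ["/* Rexx */", "DO", "END"],
}


def guess_ebcdic_code_page(data: bytes) -> str:
    """Guess "1047" or "037" from where the square brackets land."""
    counts = Counter(data)
    score_1047 = sum(counts[b] for b in BRACKETS_1047)
    score_037 = sum(counts[b] for b in BRACKETS_037)
    if score_1047 > score_037:
        return "1047"
    if score_037 > score_1047:
        return "037"
    return "unknown"  # no brackets seen, or a tie


def marker_hits(data: bytes, markers=MARKERS) -> dict:
    """Count occurrences of each language's markers in EBCDIC data."""
    return {
        lang: sum(data.count(word.encode("cp037")) for word in words)
        for lang, words in markers.items()
    }


if __name__ == "__main__":
    source = "#include <stdio.h>\nint main(void) { int a[10]; }"
    as_037 = source.encode("cp037")
    # Move the brackets to their 1047 positions to fake a 1047 file.
    as_1047 = as_037.translate(bytes.maketrans(b"\xba\xbb", b"\xad\xbd"))
    print(guess_ebcdic_code_page(as_037), guess_ebcdic_code_page(as_1047))
    print(marker_hits(as_1047))
```

On data with few or no brackets the first function will say "unknown",
which is the YMMV case; the marker counts are a cross-check, not a
verdict.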
