Frustrating, no doubt. But you've stumbled on a good thing in USS and OpenVM.

The nice thing about an "EBCDIC text stream" (for lack of a better term)
is that 0x15 is non-printable in both ASCII and EBCDIC. Same for
0x0A. I wrote some crude C years ago to scan a "line" up to either
0x15 or 0x0A and translate accordingly: 0x15 means EBCDIC and 0x0A
means ASCII. See below.

It's true that the ASCII 0x0A translates to EBCDIC 0x25, but that is
to your advantage in the long term. The pain you encounter is that
the file(s) got translated literally rather than logically. If you
transfer the EBCDIC byte-stream "text files" (USS text files) in
binary and then convert them locally (on Linux), you can CONSISTENTLY
get them right. You get this wonderful effect that plain text of
either flavor is automagically detected. Nice!
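To make the detection idea concrete, here is a minimal sketch in C. It is not lifted from getline.c; the names classify_line and line_flavor are made up for illustration. It only decides which flavor a buffer is by looking for the first 0x0A or 0x15 and leaves translation to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; not taken from getline.c. */
enum line_flavor { FLAVOR_UNKNOWN, FLAVOR_ASCII, FLAVOR_EBCDIC };

/* Scan buf for the first line terminator and report which flavor it
 * implies: 0x0A is the ASCII LF, 0x15 is the EBCDIC NL.  Neither byte
 * is a printable character in the other code set, so whichever shows
 * up first settles the question for that line. */
enum line_flavor classify_line(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x0A)
            return FLAVOR_ASCII;
        if (buf[i] == 0x15)
            return FLAVOR_EBCDIC;
    }
    return FLAVOR_UNKNOWN;   /* no terminator seen: caller must decide */
}
```

A caller would run this over each chunk of a file that was transferred in binary and translate only the lines that come back as FLAVOR_EBCDIC.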
The best of my own old junk that I can find at the moment is ...

http://www.casita.net/pub/uft/getline.c

This is not the most efficient routine (byte at a time!) but it works.
It uses aecs.c (and aecs.h) for the translation and looks for the
requisite 0x15 or 0x0A to know if a given input line is to be
translated ... or not. Another handy thing it does is remove trailing
CRs.
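For anyone who just wants the shape of such a routine, the following is a rough sketch of a byte-at-a-time reader along the lines described above. It is an illustration only, not the contents of getline.c or aecs.c: the names read_text_line, init_e2a and e2a are hypothetical, and the translate table is deliberately partial (letters, digits, space and a few marks), where a real one would cover all 256 code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Partial EBCDIC-to-ASCII map; an illustrative stand-in, not aecs.c. */
unsigned char e2a[256];

static void map_run(unsigned char ebcdic_start, const char *ascii)
{
    for (size_t i = 0; ascii[i] != '\0'; i++)
        e2a[ebcdic_start + i] = (unsigned char)ascii[i];
}

/* Fill the table once before use; unmapped bytes become '?'. */
void init_e2a(void)
{
    for (int i = 0; i < 256; i++)
        e2a[i] = '?';
    e2a[0x05] = '\t'; e2a[0x40] = ' '; e2a[0x4B] = '.'; e2a[0x6B] = ',';
    map_run(0x81, "abcdefghi"); map_run(0x91, "jklmnopqr");
    map_run(0xA2, "stuvwxyz");  map_run(0xC1, "ABCDEFGHI");
    map_run(0xD1, "JKLMNOPQR"); map_run(0xE2, "STUVWXYZ");
    map_run(0xF0, "0123456789");
}

/* Read one line, a byte at a time, into buf (cap includes the NUL).
 * A line ended by 0x15 is taken as EBCDIC and translated; a line ended
 * by 0x0A (or by end of input) is left as it is.  Trailing CRs are
 * dropped.  Returns the line length, or -1 if nothing was left to read. */
long read_text_line(FILE *in, char *buf, size_t cap)
{
    size_t n = 0;
    int c, is_ebcdic = 0, saw_any = 0;

    assert(in != NULL && buf != NULL && cap > 0);
    while ((c = fgetc(in)) != EOF) {
        saw_any = 1;
        if (c == 0x0A)
            break;
        if (c == 0x15) {
            is_ebcdic = 1;
            break;
        }
        if (n + 1 < cap)            /* bytes beyond cap-1 are dropped */
            buf[n++] = (char)c;
    }
    if (!saw_any)
        return -1;
    while (n > 0 && buf[n - 1] == 0x0D)   /* CR is 0x0D in both code sets */
        n--;
    if (is_ebcdic)
        for (size_t i = 0; i < n; i++)
            buf[i] = (char)e2a[(unsigned char)buf[i]];
    buf[n] = '\0';
    return (long)n;
}
```

Call init_e2a() once, then call read_text_line() in a loop until it returns -1. Stripping the CR before translating works because CR happens to be 0x0D in both ASCII and EBCDIC.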
You are also welcome to maketext.c, which uses the aforementioned C
chunks, but I don't like it. I wrote it to convert un-TARred files
imported to OpenVM from Unix. It's just not very elegant and not very
robust. (Should work okay as a filter, if I recall.) It has been
months or years since I have touched this. Lately, I leave
translation to things like the CMS NFS client and go with all or
nothing.

Please excuse that the directory where this stuff lives is fronted by
an index.html.

-- R; <><

On Fri, Nov 20, 2009 at 07:15, John McKown <[email protected]> wrote:
> I'm trying to figure out a better way to do this. Given: I have some
> EBCDIC text files on Linux. Instead of being delimited with 0x25
> (linefeed), they are delimited with 0x15 (NewLine). This is how z/OS
> UNIX does text files and I cannot change it. When run through iconv
> the 0x15 is translated to 0x85, but I need 0x0a. I end up doing a
> "tr '\205' '\n'" in order to end up with 0x0a in the ASCII file. Note
> that when I do an
> ASCII ftp from z/OS, it is doing the conversion "correctly" (defined as
> what I want). However, I don't want to do this. The file being
> transferred is a PAX file which contains true binary as well as text
> files. So what I'm really doing is like (after using tar to unwind the
> PAX file):
>
> for i in *;do file "$i" | fgrep -q EBCDIC && cnv "$i";done
>
> (yes, the fgrep actually does select the files that I want without, so
> far, any false positives or negatives)
>
> where cnv is a shell script:
>
> #!/bin/sh
> iconv -f ISO8859-1 -t ISO8859-1 "$1" |\
> tr '\205' '\n' >"$1.new"
> chmod --reference="$1" "$1.new"
> mv "$1" "$1.org"
> mv "$1.new" "$1"
>
> This works but is not, IMO, "elegant". Any better ideas how to do this?
>
> Many thanks!
>
> --
> John McKown
> Maranatha! <><
>
> ----------------------------------------------------------------------
> For LINUX-390 subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO LINUX-390 or visit
> http://www.marist.edu/htbin/wlvindex?LINUX-390
>