Sometimes it's outright prohibited, e.g., RFC 8259: "Implementations MUST NOT 
add a byte order mark (U+FEFF) to the beginning of a networked-transmitted 
JSON text.  In the interests of interoperability, implementations that parse 
JSON texts MAY ignore the presence of a byte order mark rather than treating 
it as an error."
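
As a rough illustration of what the RFC 8259 text quoted above asks for, here 
is a minimal Python sketch, not a definitive implementation.  The function 
names parse_json_bytes and serialize_json are hypothetical; the only library 
behavior relied on is that the "utf-8-sig" codec drops one leading BOM when 
present and otherwise decodes like plain "utf-8".

```python
import codecs
import json


def parse_json_bytes(payload: bytes):
    """Parse UTF-8 JSON, tolerating (and ignoring) one leading BOM."""
    # 'utf-8-sig' strips a leading EF BB BF if present; otherwise it
    # decodes exactly like 'utf-8'.
    text = payload.decode("utf-8-sig")
    return json.loads(text)


def serialize_json(obj) -> bytes:
    """Serialize to UTF-8 JSON without adding a BOM."""
    # The plain 'utf-8' codec never emits a BOM on encode.
    return json.dumps(obj).encode("utf-8")


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + b'{"vendor": "example"}'
    print(parse_json_bytes(with_bom))
    print(serialize_json({"vendor": "example"}))
```

The receiver is lenient and the sender is strict, which is the split the RFC 
describes.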

Further, the IETF is moving toward protocols in which UTF-8 is mandatory, and 
RFC 3629, section 6, "Byte order mark (BOM)", states:

   In the meantime, the uncertainty unfortunately remains and may affect
   Internet protocols.  Protocol specifications MAY restrict usage of
   U+FEFF as a signature in order to reduce or eliminate the potential
   ill effects of this uncertainty.  In the interest of striking a
   balance between the advantages (reduction of uncertainty) and
   drawbacks (loss of the signature function) of such restrictions, it
   is useful to distinguish a few cases:

   o  A protocol SHOULD forbid use of U+FEFF as a signature for those
      textual protocol elements that the protocol mandates to be always
      UTF-8, the signature function being totally useless in those
      cases.

   o  A protocol SHOULD also forbid use of U+FEFF as a signature for
      those textual protocol elements for which the protocol provides
      character encoding identification mechanisms, when it is expected
      that implementations of the protocol will be in a position to
      always use the mechanisms properly.  This will be the case when
      the protocol elements are maintained tightly under the control of
      the implementation from the time of their creation to the time of
      their (properly labeled) transmission.



--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Discussion List [[email protected]] on behalf of 
Paul Gilmartin [[email protected]]
Sent: Tuesday, July 27, 2021 7:05 PM
To: [email protected]
Subject: Re: FTP distributed system EBCDIC encoded file

On Tue, 27 Jul 2021 17:01:56 -0500, Frank Swarbrick wrote:

>We have a vendor that is providing a file that is EBCDIC (IBM-1140) encoded, 
>but also includes an NL record/line terminator.  The source system is NOT a 
>mainframe system.  I'm trying to figure out how to FTP the file to the 
>mainframe and have it treat NL as, well, NL; i.e. a record terminator.  Binary 
>mode (no SITE options) doesn't work because it stores the NL characters.  
>ASCII mode (no SITE options) doesn't work, I believe because it still expects 
>the CRLF delimiter.  I tried specifying "SITE TYPE E" (EBCDIC) and that also 
>does not eliminate the NL delimiter.
>
>Any thoughts?  We're seeing if the vendor can just not use a delimiter at all, 
>but no luck yet.
>
Doesn't z/OS use NL as its line separator?  Verify/refute this with:
    echo 'foo
bar' | od -tx1

I'd expect you to see:
    0000000 86 96 96 15 82 81 99 15
    0000010

where the x'15' is the NL.  I expect transfer in binary to preserve the NL and 
simply work.
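
If the target has to be FB, one option is to pre-process the file on the 
sending side and then transfer it in binary.  The following Python sketch is 
only an illustration under stated assumptions: the function name 
nl_to_fixed_records is hypothetical, the LRECL of 80 is a placeholder, and it 
assumes the file is pure text, so that x'15' never occurs inside a data field 
(e.g. packed decimal).  It splits on x'15' and pads each record with EBCDIC 
blanks (x'40').

```python
EBCDIC_NL = b"\x15"     # NL in IBM-1140 / IBM-1047
EBCDIC_SPACE = b"\x40"  # blank in EBCDIC


def nl_to_fixed_records(data: bytes, lrecl: int = 80) -> bytes:
    """Turn NL-delimited EBCDIC text into blank-padded fixed-length records."""
    lines = data.split(EBCDIC_NL)
    # A trailing NL leaves one empty piece after the split; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    out = bytearray()
    for number, line in enumerate(lines, start=1):
        if len(line) > lrecl:
            raise ValueError(f"record {number} exceeds LRECL {lrecl}")
        # Pad on the right with EBCDIC blanks up to the record length.
        out += line.ljust(lrecl, EBCDIC_SPACE)
    return bytes(out)


if __name__ == "__main__":
    sample = bytes.fromhex("8696961582819915")  # 'foo' NL 'bar' NL
    print(nl_to_fixed_records(sample, lrecl=4).hex(" "))
```

The output contains no delimiters, so a binary transfer into a data set 
allocated with a matching LRECL should line up on record boundaries.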

>Note: They can create it in UTF-8, but they are including the UTF-8 Byte Order 
>Mark (BOM).  I am able to get z/OS to strip the BOM, but I have to specify the 
>transmission as being "multi-byte", so the destination has to be VB.  Which we 
>can deal with, but we'd prefer FB as that is how we have it from the old 
>vendor.
>
Use of a BOM with UTF-8 is generally deprecated.

-- gil

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
