I recently spent a bunch of time with a customer who was having trouble connecting to our appliance from z/OS. They were getting error 410, "SSL message format is incorrect". curl was failing too, and it doesn't even use System SSL.
After much tinkering, looking at PCAPs, tracing on z/OS, etc., someone said something about AT-TLS.

"Wait, what? There's no AT-TLS involved here."
"Yes there is, we have it on all connections."

Well, there's yer problem, Vern: our product on z/OS was setting up an https connection using GSK (System SSL), or curl was using OpenSSL. Those requests would start their way out to the network, and then AT-TLS would grab them and start its own negotiation. So what we'd see in Wireshark was approximately:

1. Mainframe starts handshake
2. Server (actually a gateway, but that doesn't matter) does its handshake thing
3. Certificates, ciphers, keys exchanged
4. Mainframe says 410 and drops connection

Since this of course worked fine for us, we were baffled until we realized AT-TLS was involved: z/OS sent out a Client Hello, and then AT-TLS got in there and the response from the gateway was NOT the expected Server Hello! In retrospect, the fact that curl was also failing MIGHT have been a clue, but at the time we took it as evidence that the problem was outside of z/OS. Instead, it appears the sequence was:

product<=>GSK<=>PAGENT<=>AT-TLS<=>TCP/IP<=>network<=>gateway

and

curl<=>OpenSSL<=>PAGENT<=>AT-TLS<=>TCP/IP<=>network<=>gateway

AT-TLS is cool, but not when you didn't ask for it. I had assumed that it was integrated into GSK and/or TCP/IP such that this scenario would be impossible. If it were, then presumably a gsk_environment_init() would keep AT-TLS from kicking in, or cause a meaningful error. Not blaming IBM: this is a user error, and I made an assumption that, while plausible, just isn't correct.

The 410 "SSL message format is incorrect" was baffling; even IBM Level 2 was stymied, since they didn't know about the stacked protocols. And apparently whatever tracing they got from the customer didn't show it. This again makes sense, since only one layer of TLS ever actually got established.
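For anyone chasing something similar, here is a minimal Python sketch of one way to check what is actually coming back to the application's TLS library. It is not what was used in this case, and it assumes you can get at the first raw bytes the application receives (from a trace, say). The function name classify_first_bytes and its labels are hypothetical; the only facts it relies on are the standard TLS record layout (content type 0x16 for handshake, a two-byte version, a two-byte length, then the handshake message type, where 1 is ClientHello and 2 is ServerHello).

```python
def classify_first_bytes(data: bytes) -> str:
    """Roughly classify the first bytes received on a connection.

    Hypothetical helper for illustration only. It looks at the TLS
    record header (5 bytes) plus the first handshake byte.
    """
    # TLS record: type(1) | version(2) | length(2) | payload...
    if len(data) >= 6 and data[0] == 0x16 and data[1] == 0x03:
        handshake_type = data[5]
        names = {1: "ClientHello", 2: "ServerHello"}
        label = names.get(handshake_type, f"type-{handshake_type}")
        return "tls-handshake:" + label
    # Plaintext HTTP where a TLS record was expected is one possible
    # symptom of another layer having already handled the TLS.
    if data[:5] == b"HTTP/":
        return "plaintext-http"
    return "unknown"
```

If the application has just sent its own Client Hello and this reports anything other than a ServerHello, that is at least a hint that something else in the path is doing TLS on its behalf.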
I wonder whether a sharp eye might find two gsk_environment_init() calls for one connection, but I can hardly blame them, since that isn't anywhere near where the error was reported!

So why am I telling you this? In case it helps someone else save some tsuris. Googling 410 "SSL message format is incorrect" only gets 13 hits; add "AT-TLS" and that drops to 5, none of which are about this "stacking" issue. So this is not a commonly encountered problem.

...phsiii
