RE: problems with too many ssl_read and ssl_write errors

2021-08-19 Thread Michael Wojcik
> From: openssl-users On Behalf Of David Bowers via openssl-users
> Sent: Wednesday, 18 August, 2021 16:38

I don't think this is OpenSSL-related, but at this point it's not clear what 
the issue is.

> - After maybe a few hours/days we see the clients dropping connections. The
> logs indicate the SSL_Read or SSL_Write on the Server fails for a client
> with SSL_Error number 5 (SSL_ERROR_SYSCALL) and the equivalent Windows error
> of WSATimeOut. We then observe the WSAECONNRESET as the Client closed
> connection. We see this behavior for multiple sites.

I assume this is a Server-edition version of Windows and you're not trying to 
support that kind of connection load on a desktop edition.

What's set in the Registry under 
HKLM\SYSTEM\CurrentControlSet\Services\TCPIP\Parameters? In particular I'd be 
suspicious of SynAttackProtect and NetworkThrottlingIndex (which shouldn't be 
set on Server, but you never know).

Many online references will suggest altering settings that affect the 
ephemeral-port space, such as TcpTimedWaitDelay, but those are irrelevant on 
the server side (since the connection tuples will use the server port, not an 
ephemeral port, for the server side).

Many of the settings under the TCPIP/Performance key are undocumented. This 
page describes a number of them:

https://forums.alliedmods.net/showpost.php?s=5fedba9ea66557ccea3bfee9e192aaf4&p=1744400&postcount=1

It also discusses a number of netsh commands for TCP/IP tuning.

> - The number of Clients disconnected starts increasing and we see the logs
> in the Client where the server refuses any more connections from Clients
> (10061-WSAECONNREFUSED). There is nothing to indicate this state in the
> server logs. Our theory is the backlog is filled and Server refusing further
> connections.

That's possible. Windows, unlike BSD-based stacks, sends an RST when the listen 
queue is full. (BSD-based stacks simply discard the inbound SYN, which is a 
better choice for a number of reasons. Windows did this wrong and stubbornly 
refuses to change.)

You say you're specifying a backlog of 500 in the call to listen(). Microsoft 
recommends just passing SOMAXCONN and letting the provider set a "suitable" 
value. Worth trying.

But this appears to be a secondary issue. The primary one seems to be that for 
whatever reason you get an increasing number of conversation failures, and then 
the client's aggressive retry behavior means you get a cascade of connection 
flooding until the listen queues are full. The clients ought to be changed to 
use random backoff or another strategy that avoids flooding the server, but at 
this point that seems to be addressing a symptom rather than the underlying 
problem.
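
The random-backoff suggestion above could be sketched roughly as follows, as
capped exponential backoff with full jitter. The function names, the 500 ms
base, and the 60 s cap are illustrative assumptions, not values from this
thread. The random number is supplied by the caller so the sketch does not
depend on a particular generator (note that rand() on MSVC only returns values
up to 32767, which is too small a range for a 60 s window).

```c
#include <stdint.h>

/* Upper bound of the retry window for a given attempt number:
 * base_ms * 2^attempt, capped at cap_ms. Names are hypothetical. */
uint32_t backoff_ceiling_ms(unsigned attempt, uint32_t base_ms, uint32_t cap_ms)
{
    uint32_t ceiling = base_ms;

    if (base_ms == 0)
        return 0;
    while (attempt-- > 0 && ceiling < cap_ms) {
        if (ceiling > cap_ms / 2) {   /* doubling would pass the cap */
            ceiling = cap_ms;
            break;
        }
        ceiling *= 2;
    }
    return ceiling < cap_ms ? ceiling : cap_ms;
}

/* "Full jitter": pick a delay in [0, ceiling] from a caller-supplied
 * random value, so that clients which failed together do not all
 * reconnect at the same instant. */
uint32_t backoff_delay_ms(unsigned attempt, uint32_t base_ms,
                          uint32_t cap_ms, uint32_t random_value)
{
    uint32_t ceiling = backoff_ceiling_ms(attempt, base_ms, cap_ms);

    if (ceiling == UINT32_MAX)
        return random_value;          /* avoid overflow in ceiling + 1 */
    return random_value % (ceiling + 1);
}
```

A client would call backoff_delay_ms(attempt, 500, 60000, r) after each failed
connect, sleep for that many milliseconds, then retry, resetting attempt to 0
once a connection succeeds.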

> - We are trying to find why we get the SSL_Read/SSL_Write error, as it is a
> blocking socket. We cannot switch to a non-blocking socket due to platform
> and application limitations.

You said you're specifically getting SSL_ERROR_SYSCALL from SSL_read and 
SSL_write. That has nothing to do with whether the socket is in blocking mode 
-- system calls on blocking sockets can certainly return errors. I don't 
understand this question.

There are any number of reasons why the server's ability to handle this load 
might be compromised. Network congestion, bufferbloat, load on the CPU or NIC 
(particularly if TCP offload is enabled to the NIC), contention for DMA, other 
application I/O, and so on. Years ago, I had one customer who had similar problems 
which turned out to be due to intermittent failures in a bad DRAM module in the 
server. Distributed computing is inherently fragile.

But in my experience, this sort of problem is most often due to one or more of:

- Application-logic errors or design issues. Are you multiplexing all these 
blocking sockets, or running a thread per conversation, or something else?

- Middlebox problems. Routers, load balancers, firewall appliances, and so 
forth frequently cause issues.

- Application firewalls and other "anti-malware" software (much of which is 
rubbish) running on the server.

WSAETIMEDOUT on a send operation, assuming OpenSSL didn't need to do a receive 
under the covers for TLS-protocol reasons, could mean that a client app isn't 
doing its receives and consequently its receive window has filled; or it could 
mean that something is interfering with the delivery of network traffic in one 
direction or the other.

WSAETIMEDOUT on a receive, though, again assuming OpenSSL didn't need to send 
under the covers, implies that something set a receive timeout on the socket, 
or that a keepalive wasn't responded to in the required time. Are you setting a 
receive timeout (typically with SO_RCVTIMEO)? Are you setting SO_KEEPALIVE? 
What about SIO_KEEPALIVE_VALS (via WSAIoctl)? If you're not setting it, what are 
KeepAliveTime and KeepAliveInterval set to in the Registry? (See the MSDN docs 
for SO_KEEPALIVE.)

Has the system administrator analyzed the Windows event logs and the network 
statistics?

Re: IMPLEMENT_ASN1_FUNCTIONS tutorial or help

2021-08-19 Thread Ken Goldman

On 8/17/2021 9:47 PM, Sands, Daniel via openssl-users wrote:

> The dump you show below is:
> Attributes (set, tagged with a 0, optional)
> Version
> privateKeyAlgorithm
> privateKey
>
> This is a PKCS#8 packet for a key.  The encapsulated data is the RSA public key
> in PKCS1 format.  I know OpenSSL has built-in PKCS#8 capability, though I do
> note that the optional attribute set is out of sequence.
>
> Either way, you could look at the PKCS8 source code and simply move the
> attribute to the beginning and otherwise duplicate the ASN1 parts and structure
> there, even if OpenSSL fails to parse this not-quite-spec packet.


For the record, it was an inconsistency - ASN1_SIMPLE requires a pointer, 
ASN1_EMBED does not.

I used the example in x_x509.c, which uses EMBED, but I could not find the 
corresponding typedef.

(I have no opportunity to change the input.  It comes from a standard HSM.)



Re: libcrypto.dylib, building for macOS-arm64 but attempting to link with file built for macOS-x86_64

2021-08-19 Thread Jakob Bohm via openssl-users
This is a known deficiency in how Apple rushed support for their 
M1 ARM desktop CPU into the Xcode build suite.


When building a desktop macOS program with Xcode, it will 
implicitly try to build both an x86_64 and an arm64 variant, and will 
be very surprised that your compiled open source library files contain 
only code for one of those architectures.


A potential workaround is to build OpenSSL for all 3 macOS desktop 
architectures and combine them with the "lipo" tool to create a "fat 
library".  The major shortcoming of this is that they use the same 
architecture "key" value for macOS and iOS, but require the libraries to 
be compiled differently (because the ABI for some system functions 
differs).  Before the M1 merge, this would work thanks to arm64 always 
being iOS hardware and x86_64 always being macOS hardware.  A workaround 
for that is to create a separate set of library files for iOS (including 
iOS emulator on x86-family desktops) and macOS (including x86-family and 
arm64 architectures), then being careful to reference the right set for 
each project.  Unfortunately, there is no workaround to use the same set 
of library files for iOS on arm64 mobile hardware and iOS emulator on 
arm64 desktop hardware.


On 2021-08-18 07:40, Stephen Dominic Liang wrote:
Hi. I installed OpenSSL 1.1 (macOS 11.3.1) using Homebrew. I added the 
following to my .bash_profile:


export PKG_CONFIG_PATH="/opt/homebrew/opt/openssl@1.1/lib/pkgconfig"

I added this to the global path file at /etc/paths:

/opt/homebrew/opt/openssl@1.1/bin

I've tried a number of other steps. What are some other 
fixes/suggestions for debugging this issue?


Errors:

/Applications/CLion.app/Contents/bin/cmake/mac/bin/cmake --build /Users/stephenjje/Documents/Je/test/cmake-build-debug --target test -- -j 6
[ 2%] Linking C executable test
ld: warning: ignoring file /usr/local/Cellar/openssl@1.1/1.1.1k/lib/libcrypto.dylib, building for macOS-arm64 but attempting to link with file built for macOS-x86_64
ld: warning: ignoring file /usr/local/Cellar/openssl@1.1/1.1.1k/lib/libssl.dylib, building for macOS-arm64 but attempting to link with file built for macOS-x86_64
Undefined symbols for architecture arm64:
  "_ERR_print_errors_fp", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_OPENSSL_init_crypto", referenced from:
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
  "_OPENSSL_init_ssl", referenced from:
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
  "_SSL_CIPHER_get_name", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
  "_SSL_CTX_free", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
  "_SSL_CTX_new", referenced from:
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_CTX_use_PrivateKey_file", referenced from:
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
  "_SSL_CTX_use_certificate_file", referenced from:
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
  "_SSL_accept", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
  "_SSL_connect", referenced from:
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_ctrl", referenced from:
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_free", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
  "_SSL_get_current_cipher", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
  "_SSL_get_peer_certificate", referenced from:
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_new", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_read", referenced from:
      _http_tcpip_inbound_parse_request in http_tcpip_inbound.c.o
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_set_fd", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_SSL_shutdown", referenced from:
      _http_tcpip_inbound_initialize in http_tcpip_inbound.c.o
  "_SSL_write", referenced from:
      _http_tcpip_inbound_send_response in http_tcpip_inbound.c.o
      _http_tcpip_outbound_request_send_type_tls in http_tcpip_outbound.c.o
  "_TLS_client_method", referenced from:
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o
  "_TLS_server_method", referenced from:
      _http_tcpip_inbound_tls_initialize in http_tcpip_inbound.c.o
  "_X509_free", referenced from:
      _http_tcpip_outbound_get_url_using_string_type_tls in http_tcpip_outbound.c.o

Re: Crash seen in "OPENSSL_sk_pop_free" API

2021-08-19 Thread Viktor Dukhovni
On Thu, Aug 19, 2021 at 05:59:30AM +, Bala Duvvuri wrote:

> We invoke X509_verify_cert() during certificate verification and
> this fails (as expected, due to the missing CA certificate), so we
> invoke X509_STORE_CTX_free to clean up the "X509_STORE_CTX" context
> and hit this crash (it is not always seen).
> 
> X509_STORE_new()
> X509_STORE_CTX_new()
> X509_STORE_set_verify_cb_func

What does your callback do?

> X509_STORE_set_default_paths
> X509_STORE_load_locations
> X509_STORE_CTX_init
> X509_STORE_CTX_set_flags
> X509_verify_cert -> Fails with error
> X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY as CA certificate is not present.
> 
> /* Cleanup. */
> X509_STORE_CTX_free(pContext); -> Crash seen here in sk_X509_pop_free
> 
> void OPENSSL_sk_pop_free(OPENSSL_STACK *st, OPENSSL_sk_freefunc func)
> {
>     int i;
>
>     if (st == NULL)
>         return;
>     for (i = 0; i < st->num; i++)
>         if (st->data[i] != NULL)    /* <- crash seen here */

If the backing array for the stack points at invalid memory, then something
has already freed the stack.
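
One common way for that to happen is two code paths (say, an error handler and
the normal cleanup path) both freeing the same object. Whether that is what is
happening here is not known. As a minimal sketch of a defensive pattern, using
a hypothetical toy_stack type that is NOT OpenSSL's OPENSSL_STACK, cleanup can
go through the caller's handle and clear it, so a second cleanup call becomes
a no-op instead of walking freed memory:

```c
#include <stdlib.h>

/* Hypothetical stand-in for an owned object; not an OpenSSL type. */
typedef struct {
    void **data;  /* backing array */
    int num;      /* number of slots */
} toy_stack;

/* Allocate a stack with `num` empty slots; returns NULL on failure. */
toy_stack *toy_stack_new(int num)
{
    toy_stack *st;

    if (num < 0)
        return NULL;
    st = malloc(sizeof(*st));
    if (st == NULL)
        return NULL;
    /* Allocate at least one slot so data is never a zero-size block. */
    st->data = calloc(num > 0 ? (size_t)num : 1, sizeof(*st->data));
    if (st->data == NULL) {
        free(st);
        return NULL;
    }
    st->num = num;
    return st;
}

/* Free through the caller's handle and set it to NULL, so that a
 * repeated cleanup sees NULL and returns without touching freed memory. */
void toy_stack_free_and_clear(toy_stack **pst)
{
    if (pst == NULL || *pst == NULL)
        return;
    free((*pst)->data);
    free(*pst);
    *pst = NULL;
}
```

This only protects a single handle. If another copy of the pointer exists
elsewhere (for example, inside a callback's context), that copy still dangles,
so auditing which code owns and frees each object remains the real fix.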

Which OpenSSL versions exhibit this issue?  Have you tried other (older
or newer) versions of OpenSSL to determine whether there's an OpenSSL
regression or more likely a bug in your code?

-- 
Viktor.