I'm upgrading the Apache HTTP server that hosts our SVN repos on RedHat Enterprise Linux 8. With Subversion 1.14.1, svn checkout of even a small, simple repo of about 150 files hangs about 90% of the time, crashes about 5%, and succeeds about 5%. Given enough time, a hung checkout eventually times out after checking out much of the repo. A debugger shows the following stack during the hang:

  #0  epoll_wait                    /usr/lib64/libc.so.6
  #1  impl_pollset_poll             /usr/lib64/libapr-1.so.0
  #2  serf_context_run              /usr/lib64/libserf-1.so.0
  #3  svn_ra_serf__context_run      /usr/lib64/libsvn_ra_serf-1.so.0
  #4  finish_report                 /usr/lib64/libsvn_ra_serf-1.so.0
  #5  svn_wc_crawl_revisions5       /usr/lib64/libsvn_wc-1.so.0
  #6  update_internal.isra          /usr/lib64/libsvn_client-1.so.0
  #7  svn_client__update_internal   /usr/lib64/libsvn_client-1.so.0
  #8  svn_client__checkout_internal /usr/lib64/libsvn_client-1.so.0
  #9  svn_client_checkout3          /usr/lib64/libsvn_client-1.so.0
  #10 svn_cl__checkout
  #11 sub_main
  #12 main

strace shows repeated calls to epoll_wait about 1 sec apart.

When the checkout crashes, it's a SIGSEGV with this stack,

  #0  apr_pool_create_ex (libapr-1.so.0)
  #1  svn_pool_create_ex (libsvn_subr-1.so.0)
  #2  update_opened (libsvn_ra_serf-1.so.0)
  #3  expat_start (libsvn_ra_serf-1.so.0)
  #4  expat_start_handler (libsvn_subr-1.so.0)
  #5  doContent (libexpat.so.1)
  #6  contentProcessor (libexpat.so.1)
  #7  XML_ParseBuffer (libexpat.so.1)
  #8  svn_xml_parse (libsvn_subr-1.so.0)
  #9  expat_response_handler (libsvn_ra_serf-1.so.0)
  #10 process_buffer.isra.9 (libsvn_ra_serf-1.so.0)
  #11 finish_report (libsvn_ra_serf-1.so.0)
  #12 svn_wc_crawl_revisions5 (libsvn_wc-1.so.0)
  #13 update_internal.isra.0 (libsvn_client-1.so.0)
  #14 svn_client__update_internal (libsvn_client-1.so.0)
  #15 svn_client__checkout_internal (libsvn_client-1.so.0)
  #16 svn_client_checkout3 (libsvn_client-1.so.0)
  #17 svn_cl__checkout (svn)
  #18 sub_main (svn)
  #19 main (svn)
  #20 __libc_start_main (libc.so.6)
  #21 _start (svn)

or this one,

  #0  apr_allocator_alloc (libapr-1.so.0)
  #1  serf_bucket_mem_alloc (libserf-1.so.0)
  #2  serf_bucket_response_create (libserf-1.so.0)
  #3  serf__process_connection (libserf-1.so.0)
  #4  serf_event_trigger (libserf-1.so.0)
  #5  serf_context_run (libserf-1.so.0)
  #6  svn_ra_serf__context_run (libsvn_ra_serf-1.so.0)
  #7  finish_report (libsvn_ra_serf-1.so.0)
  #8  svn_wc_crawl_revisions5 (libsvn_wc-1.so.0)
  #9  update_internal.isra (libsvn_client-1.so.0)
  #10 svn_client__update_internal (libsvn_client-1.so.0)
  #11 svn_client__checkout_internal (libsvn_client-1.so.0)
  #12 svn_client_checkout3 (libsvn_client-1.so.0)
  #13 svn_cl__checkout (svn)
  #14 sub_main (svn)
  #15 main (svn)

After a failure, I'm left with a half-checked out working copy with many locks. I can complete it with svn cleanup and another svn checkout, but that's not realistic for our CI/CD or general use.

Server logs show no indication of a problem; the server appears healthy.

I've tried a million things before submitting this bug report, read half a million posts and searches, but haven't been able to get past this. I'd sure appreciate any ideas you have on the way forward.

Here's a bit more about my system.

* svn, version 1.14.1 (r1886195)

    * ra_svn : Module for accessing a repository using the svn network protocol.
      - with Cyrus SASL authentication
      - handles 'svn' scheme
    * ra_local : Module for accessing a repository on local disk.
      - handles 'file' scheme
    * ra_serf : Module for accessing a repository via WebDAV protocol using serf.
      - using serf 1.3.9 (compiled with 1.3.9)
      - handles 'http' scheme
      - handles 'https' scheme

    The following authentication credential caches are available:

    * Plaintext cache in /home/me/.subversion
    * Gnome Keyring
    * GPG-Agent

* svn 1.10.2 was failing the same way before we upgraded to 1.14.1 as a possible fix.
* Checking out to a local disk succeeds more often, but still hangs and crashes. Checking out to an NFS drive just makes it worse.

And here's more about our Apache.

* Server version: Apache/2.4.37 (Red Hat Enterprise Linux)
  Server built:   Aug 30 2023 11:01:53
  Server's Module Magic Number: 20120211:83
  Server loaded:  APR 1.6.3, APR-UTIL 1.6.1
  Compiled using: APR 1.6.3, APR-UTIL 1.6.1
  Architecture:   64-bit
  Server MPM:     worker
    threaded:     yes (fixed thread count)
    forked:       yes (variable process count)
  Server compiled with....
   -D APR_HAS_SENDFILE
   -D APR_HAS_MMAP
   -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
   -D APR_USE_SYSVSEM_SERIALIZE
   -D APR_USE_PTHREAD_SERIALIZE
   -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
   -D APR_HAS_OTHER_CHILD
   -D AP_HAVE_RELIABLE_PIPED_LOGS
   -D DYNAMIC_MODULE_LIMIT=256
   -D HTTPD_ROOT="/etc/httpd"
   -D SUEXEC_BIN="/usr/sbin/suexec"
   -D DEFAULT_PIDLOG="run/httpd.pid"
   -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
   -D DEFAULT_ERRORLOG="logs/error_log"
   -D AP_TYPES_CONFIG_FILE="conf/mime.types"
   -D SERVER_CONFIG_FILE="conf/httpd.conf"
* Access to this server from a browser with SVN and ViewVC pages seems to work.
* Authentication is over Kerberos with mod_auth_gssapi.
* Authorization uses AuthzSVNAccessFile and an access file.
* SSL is used with SSLCryptoDevice set to builtin, based on this <https://lists.apache.org/thread/zysfq4cb0jkz59p0wkhfm49xwr8lj5to>.
* I've tried all three MPMs with no change, based on another post: prefork, worker, and event.
* We've had Apache running on RedHat 6 with these repos for many years.

I'd be glad to provide additional details or run more tests.

Thanks for any ideas you have, and for supporting this software.

Jim
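
For reference, the manual recovery described above (svn cleanup followed by another svn checkout) can be sketched roughly as below. This is a minimal sketch of that stopgap, not a fix for the underlying hang or crash. The function name checkout_with_retry, the retry count, the timeout value, and the injectable run parameter are all hypothetical and chosen only for illustration; the only svn subcommands used are checkout and cleanup.

```python
import subprocess


def checkout_with_retry(url, dest, attempts=3, timeout=600, run=subprocess.run):
    """Run `svn checkout`; on failure or timeout, run `svn cleanup` and retry.

    Hypothetical helper: returns the 1-based attempt number that succeeded,
    or raises RuntimeError once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = run(["svn", "checkout", url, dest], timeout=timeout)
            if result.returncode == 0:
                return attempt
            # A non-zero (or negative, e.g. SIGSEGV) return code falls through.
        except subprocess.TimeoutExpired:
            # The hang case: subprocess.run kills the child when the
            # timeout expires, then raises TimeoutExpired.
            pass
        # Release the working-copy locks left behind by the failed attempt.
        # The result is ignored because dest may not be a working copy yet.
        run(["svn", "cleanup", dest], timeout=timeout)
    raise RuntimeError(f"svn checkout of {url} failed after {attempts} attempts")
```

A call such as checkout_with_retry(url, "wc") would retry up to three times. Given the failure rates above, a wrapper like this only hides the problem from CI and makes each job slower; it does not explain it.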