I'm upgrading the Apache HTTP server for our SVN repos on Red Hat Enterprise 
Linux 8.  Using Subversion 1.14.1, svn checkout of even a small, simple repo 
with about 150 files hangs about 90% of the time, crashes 5%, and succeeds 5%.  
Given enough time, the hangs eventually time out after checking out much of the 
repo.  A debugger shows the following stack during the hang.

      #0  epoll_wait                   /usr/lib64/libc.so.6
      #1  impl_pollset_poll            /usr/lib64/libapr-1.so.0
      #2  serf_context_run             /usr/lib64/libserf-1.so.0
      #3  svn_ra_serf__context_run      /usr/lib64/libsvn_ra_serf-1.so.0
      #4  finish_report                 /usr/lib64/libsvn_ra_serf-1.so.0
      #5  svn_wc_crawl_revisions5       /usr/lib64/libsvn_wc-1.so.0
      #6  update_internal.isra          /usr/lib64/libsvn_client-1.so.0
      #7  svn_client__update_internal   /usr/lib64/libsvn_client-1.so.0
      #8  svn_client__checkout_internal /usr/lib64/libsvn_client-1.so.0
      #9  svn_client_checkout3          /usr/lib64/libsvn_client-1.so.0
      #10 svn_cl__checkout
      #11 sub_main
      #12 main

strace shows repeated calls to epoll_wait about 1 sec apart.
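
For anyone who wants to check the spacing of those calls themselves, here is a minimal Python sketch that measures the gaps between consecutive epoll_wait calls in a saved trace.  It assumes the trace was captured with strace's -tt option so each line starts with an HH:MM:SS.microseconds timestamp, optionally preceded by a PID when -f is used.  The function name is my own invention, and the sketch does not handle a trace that crosses midnight.

```python
import re
from datetime import datetime

# Matches lines such as "12:00:01.000123 epoll_wait(9, [], 16, 1000) = 0".
# An optional "[pid 1234] " or "1234 " prefix (strace -f) is tolerated.
LINE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\d{2}:\d{2}:\d{2}\.\d{6})\s+epoll_wait\("
)

def epoll_wait_gaps(lines):
    """Return the seconds elapsed between consecutive epoll_wait calls."""
    stamps = []
    for line in lines:
        match = LINE.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%H:%M:%S.%f"))
    # Pair each timestamp with its successor and take the difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Feeding it the lines of a trace file, for example `epoll_wait_gaps(open(path))` with a path of your choosing, gives a list of intervals; gaps clustered near 1.0 would match what I'm seeing.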

When the checkout crashes, it's a SIGSEGV with this stack,

      #0  apr_pool_create_ex            (libapr-1.so.0)
      #1  svn_pool_create_ex            (libsvn_subr-1.so.0)
      #2  update_opened                 (libsvn_ra_serf-1.so.0)
      #3  expat_start                   (libsvn_ra_serf-1.so.0)
      #4  expat_start_handler           (libsvn_subr-1.so.0)
      #5  doContent                     (libexpat.so.1)
      #6  contentProcessor              (libexpat.so.1)
      #7  XML_ParseBuffer               (libexpat.so.1)
      #8  svn_xml_parse                 (libsvn_subr-1.so.0)
      #9  expat_response_handler        (libsvn_ra_serf-1.so.0)
      #10 process_buffer.isra.9         (libsvn_ra_serf-1.so.0)
      #11 finish_report                 (libsvn_ra_serf-1.so.0)
      #12 svn_wc_crawl_revisions5       (libsvn_wc-1.so.0)
      #13 update_internal.isra.0        (libsvn_client-1.so.0)
      #14 svn_client__update_internal   (libsvn_client-1.so.0)
      #15 svn_client__checkout_internal (libsvn_client-1.so.0)
      #16 svn_client_checkout3          (libsvn_client-1.so.0)
      #17 svn_cl__checkout              (svn)
      #18 sub_main                      (svn)
      #19 main                          (svn)
      #20 __libc_start_main             (libc.so.6)
      #21 _start                        (svn)

or this one,

      #0  apr_allocator_alloc          (libapr-1.so.0)
      #1  serf_bucket_mem_alloc        (libserf-1.so.0)
      #2  serf_bucket_response_create  (libserf-1.so.0)
      #3  serf__process_connection      (libserf-1.so.0)
      #4  serf_event_trigger            (libserf-1.so.0)
      #5  serf_context_run              (libserf-1.so.0)
      #6  svn_ra_serf__context_run      (libsvn_ra_serf-1.so.0)
      #7  finish_report                 (libsvn_ra_serf-1.so.0)
      #8  svn_wc_crawl_revisions5       (libsvn_wc-1.so.0)
      #9  update_internal.isra          (libsvn_client-1.so.0)
      #10 svn_client__update_internal   (libsvn_client-1.so.0)
      #11 svn_client__checkout_internal (libsvn_client-1.so.0)
      #12 svn_client_checkout3          (libsvn_client-1.so.0)
      #13 svn_cl__checkout              (svn)
      #14 sub_main                     (svn)
      #15 main                         (svn)

After a failure, I'm left with a half-checked-out working copy with many locks.  
I can complete it with svn cleanup and another svn checkout, but that's not 
realistic for our CI/CD or general use.  Server logs show no indication of a 
problem; the server appears healthy.
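
To put firmer numbers on the hang/crash/success split, a harness along these lines could run the checkout repeatedly and tally the outcomes.  This is only a sketch: the function names, the timeout, and the idea of treating a timeout as a hang are my assumptions, and it relies on the POSIX convention that Python reports a process killed by a signal with a negative return code (for example -11 for SIGSEGV).

```python
import subprocess
from collections import Counter

def classify(returncode, timed_out):
    """Map one finished (or abandoned) run to an outcome label."""
    if timed_out:
        return "hang"
    if returncode == 0:
        return "success"
    if returncode < 0:
        # On POSIX, a negative code means the child died from a signal.
        return "crash"
    return "error"

def run_trial(cmd, timeout_s):
    """Run cmd once; the child is killed if it exceeds timeout_s seconds."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return classify(None, True)
    return classify(proc.returncode, False)

def tally(cmd_for_trial, trials, timeout_s):
    """Run several trials and count how many ended in each outcome.

    cmd_for_trial(i) must return the argument list for trial i, so each
    trial can check out into its own fresh directory.
    """
    counts = Counter()
    for i in range(trials):
        counts[run_trial(cmd_for_trial(i), timeout_s)] += 1
    return counts

# Hypothetical usage; REPO_URL is a placeholder, not our real server:
#   tally(lambda i: ["svn", "checkout", REPO_URL, f"wc-{i}"], 20, 300)
```

Giving each trial its own target directory avoids having one failed run's leftover locks influence the next.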

I've tried a million things before submitting this bug report, read half a 
million posts and searches, but haven't been able to get past this.  I'd sure 
appreciate any ideas you have on the way forward.  Here's a bit more about my 
system.


  *   svn, version 1.14.1 (r1886195)
     * ra_svn : Module for accessing a repository using the svn network protocol.
       - with Cyrus SASL authentication
       - handles 'svn' scheme
     * ra_local : Module for accessing a repository on local disk.
       - handles 'file' scheme
     * ra_serf : Module for accessing a repository via WebDAV protocol using serf.
       - using serf 1.3.9 (compiled with 1.3.9)
       - handles 'http' scheme
       - handles 'https' scheme

     The following authentication credential caches are available:

     * Plaintext cache in /home/me/.subversion
     * Gnome Keyring
     * GPG-Agent

  *   svn 1.10.2 was failing the same way before we upgraded to 1.14.1 as a 
possible fix.
  *   Checking out to a local disk succeeds more often, but still hangs and 
crashes.  Checking out to an NFS drive just makes it worse.

And here's more about our Apache.


  *   Server version: Apache/2.4.37 (Red Hat Enterprise Linux)
      Server built:   Aug 30 2023 11:01:53
      Server's Module Magic Number: 20120211:83
      Server loaded:  APR 1.6.3, APR-UTIL 1.6.1
      Compiled using: APR 1.6.3, APR-UTIL 1.6.1
      Architecture:   64-bit
      Server MPM:     worker
        threaded:     yes (fixed thread count)
          forked:     yes (variable process count)
      Server compiled with....
       -D APR_HAS_SENDFILE
       -D APR_HAS_MMAP
       -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
       -D APR_USE_SYSVSEM_SERIALIZE
       -D APR_USE_PTHREAD_SERIALIZE
       -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
       -D APR_HAS_OTHER_CHILD
       -D AP_HAVE_RELIABLE_PIPED_LOGS
       -D DYNAMIC_MODULE_LIMIT=256
       -D HTTPD_ROOT="/etc/httpd"
       -D SUEXEC_BIN="/usr/sbin/suexec"
       -D DEFAULT_PIDLOG="run/httpd.pid"
       -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
       -D DEFAULT_ERRORLOG="logs/error_log"
       -D AP_TYPES_CONFIG_FILE="conf/mime.types"
       -D SERVER_CONFIG_FILE="conf/httpd.conf"

  *   Access to this server from a browser with SVN and ViewVC pages seems to 
work.
  *   Authentication is over Kerberos with mod_auth_gssapi.
  *   Authorization uses AuthzSVNAccessFile and an access file.
  *   SSL is used with SSLCryptoDevice set to builtin, based on this thread: 
<https://lists.apache.org/thread/zysfq4cb0jkz59p0wkhfm49xwr8lj5to>.
  *   Based on another post, I've tried all three MPMs (prefork, worker, and 
event) with no change.
  *   We've had Apache running on Red Hat 6 with these repos for many years.

I'd be glad to provide additional details or run more tests.  Thanks for any 
ideas you have, and for supporting this software.

Jim