Hi,

I've added a few comments/questions below.

Kind regards,
Daniel Sahlberg

On Sat, 11 May 2024 at 03:00, Williams, James P. {Jim} (JSC-CD4)[KBR
Wyle Services, LLC] via users <users@subversion.apache.org> wrote:

> > How did you upgrade your server from RHEL 6 to RHEL 8?
>
> Because so much changed from RHEL 6 to 8, including Apache from 2.2.15 to
> 2.4.37, all the Apache modules, etc., I started from the skeleton
> configuration the operating system provides and made mostly the same
> customizations we had for RHEL 6, or modernized them where the docs said
> things changed.  Mostly, that was tweaks to authentication (from LDAP to
> Kerberos), SSL, and the SVN endpoints.  Browser access to all SVN and
> ViewVC pages seems to work fine.
>

You previously mentioned Subversion 1.14.1. Is that on the server or on the
client?

[...]

> And do the problems happen if you use svn:// rather than https:// ?
>
> I thought svn:// worked only with svnserve, which we don't run.  Are you
> suggesting I try to run it as a test, or that I consider abandoning Apache
> in favor of it?  Yikes; that'd be painful.
>
> I hear you on the HTTP integration.  We have about 2000 repos and a few
> hundred developers.  I've supported that server for at least 15 years, and
> it hasn't been too bad...until now.
>

I have personally only ever used Subversion over http/https (except for
testing purposes) and I haven't had any of the problems described by Nico -
I guess YMMV...

Still, it would be interesting to compare, just to rule out a problem within
the repository. You can use svnserve directly or tunneled over SSH; see the
Subversion book:

https://svnbook.red-bean.com/en/1.7/svn.serverconfig.svnserve.html#svn.serverconfig.svnserve.sshauth



> On Fri, May 10, 2024 at 4:17 PM Williams, James P. {Jim} (JSC-CD4)[KBR
> Wyle Services, LLC] via users <users@subversion.apache.org> wrote:
> >
> > I'm upgrading an Apache HTTP server of our SVN repos on RedHat
> Enterprise Linux 8.  Using Subversion 1.14.1, svn checkout of even a small,
> simple repo with about 150 files hangs about 90% of the time, crashes 5%,
> and succeeds 5%.  Given enough time, the hangs eventually time out after
> checking out much of the repo.  A debugger shows the following stack during
> the hang.
> >
> >
> >
> >       #0  epoll_wait                   /usr/lib64/libc.so.6
>

Waiting for a reply from the server ... ?

Do you see any activity on the server (CPU / disk) during this time?


> >
> >       #1  impl_pollset_poll            /usr/lib64/libapr-1.so.0
> >
> >       #2  serf_context_run             /usr/lib64/libserf-1.so.0
> >
> >       #3  svn_ra_serf__context_run     /usr/lib64/libsvn_ra_serf-1.so.0
> >
> >       #4  finish_report                /usr/lib64/libsvn_ra_serf-1.so.0
> >
> >       #5  svn_wc_crawl_revisions5      /usr/lib64/libsvn_wc-1.so.0
> >
> >       #6  update_internal.isra         /usr/lib64/libsvn_client-1.so.0
> >
> >       #7  svn_client__update_internal  /usr/lib64/libsvn_client-1.so.0
> >
> >       #8  svn_client__checkout_internal /usr/lib64/libsvn_client-1.so.0
> >
> >       #9  svn_client_checkout3         /usr/lib64/libsvn_client-1.so.0
> >
> >       #10 svn_cl__checkout
> >
> >       #11 sub_main
> >
> >       #12 main
> >
> >
> >
> > strace shows repeated calls to epoll_wait about 1 sec apart.
> >
> >
> >
> > When the checkout crashes, it's a SIGSEGV with this stack,
> >
> >
> >
> >       #0  apr_pool_create_ex            (libapr-1.so.0)
>

Memory allocation?


> >
> >       #1  svn_pool_create_ex            (libsvn_subr-1.so.0)
> >
> >       #2  update_opened                 (libsvn_ra_serf-1.so.0)
> >
> >       #3  expat_start                   (libsvn_ra_serf-1.so.0)
>

Parsing the XML message from the server?

Can you capture and view the actual XML message sent from the server? I'm
wondering whether it is mangled in some strange way that upsets the XML
parser.
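
If you do manage to save the response body to a file, a quick
well-formedness check could look roughly like the sketch below. It uses
Python's standard xml.parsers.expat binding, so it is the same parser family
that shows up in your stack. The function name check_well_formed is my own
invention, and this only tells you whether the XML is well-formed, not
whether Subversion would accept its content.

```python
import xml.parsers.expat


def check_well_formed(data: bytes):
    """Return None if data parses as XML, else a short error description."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final (and only) chunk.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as exc:
        return f"line {exc.lineno}, column {exc.offset}: {exc}"
    return None
```

You would call it on the raw bytes of the captured body, for example
check_well_formed(open("response.xml", "rb").read()), where response.xml is
just a placeholder name. A truncated or corrupted response should come back
with a line and column to look at.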


> >
> >       #4  expat_start_handler           (libsvn_subr-1.so.0)
> >
> >       #5  doContent                     (libexpat.so.1)
> >
> >       #6  contentProcessor              (libexpat.so.1)
> >
> >       #7  XML_ParseBuffer               (libexpat.so.1)
> >
> >       #8  svn_xml_parse                 (libsvn_subr-1.so.0)
> >
> >       #9  expat_response_handler        (libsvn_ra_serf-1.so.0)
> >
> >       #10 process_buffer.isra.9         (libsvn_ra_serf-1.so.0)
> >
> >       #11 finish_report                 (libsvn_ra_serf-1.so.0)
> >
> >       #12 svn_wc_crawl_revisions5       (libsvn_wc-1.so.0)
> >
> >       #13 update_internal.isra.0        (libsvn_client-1.so.0)
> >
> >       #14 svn_client__update_internal   (libsvn_client-1.so.0)
> >
> >       #15 svn_client__checkout_internal (libsvn_client-1.so.0)
> >
> >       #16 svn_client_checkout3          (libsvn_client-1.so.0)
> >
> >       #17 svn_cl__checkout              (svn)
> >
> >       #18 sub_main                      (svn)
> >
> >       #19 main                          (svn)
> >
> >       #20 __libc_start_main             (libc.so.6)
> >
> >       #21 _start                        (svn)
> >
> >
> >
> > or this one,
> >
> >
> >
> >       #0  apr_allocator_alloc          (libapr-1.so.0)
> >
> >       #1  serf_bucket_mem_alloc        (libserf-1.so.0)
>

Again something with memory allocation - same here, can you see what the
server is actually sending?


> >
> >       #2  serf_bucket_response_create  (libserf-1.so.0)
> >
> >       #3  serf__process_connection     (libserf-1.so.0)
> >
> >       #4  serf_event_trigger           (libserf-1.so.0)
> >
> >       #5  serf_context_run             (libserf-1.so.0)
> >
> >       #6  svn_ra_serf__context_run     (libsvn_ra_serf-1.so.0)
> >
> >       #7  finish_report                (libsvn_ra_serf-1.so.0)
> >
> >       #8  svn_wc_crawl_revisions5      (libsvn_wc-1.so.0)
> >
> >       #9  update_internal.isra         (libsvn_client-1.so.0)
> >
> >       #10 svn_client__update_internal  (libsvn_client-1.so.0)
> >
> >       #11 svn_client__checkout_internal (libsvn_client-1.so.0)
> >
> >       #12 svn_client_checkout3         (libsvn_client-1.so.0)
> >
> >       #13 svn_cl__checkout             (svn)
> >
> >       #14 sub_main                     (svn)
> >
> >       #15 main                         (svn)
> >
> >
> >
> > After a failure, I'm left with a half-checked out working copy with many
> locks.  I can complete it with svn cleanup and another svn checkout, but
> that's not realistic for our CI/CD or general use.  Server logs show no
> indication of a problem; the server appears healthy.
> >
> >
> >
> > I've tried a million things before submitting this bug report, read half
> a million posts and searches, but haven't been able to get past this.  I'd
> sure appreciate any ideas you have on the way forward.  Here's a bit more
> about my system.
> >
> >
> >
> > svn, version 1.14.1 (r1886195)
> >
> >      * ra_svn : Module for accessing a repository using the svn network
> protocol.
> >
> >       - with Cyrus SASL authentication
> >
> >       - handles 'svn' scheme
> >
> >      * ra_local : Module for accessing a repository on local disk.
> >
> >       - handles 'file' scheme
> >
> >      * ra_serf : Module for accessing a repository via WebDAV protocol
> using serf.
> >
> >       - using serf 1.3.9 (compiled with 1.3.9)
> >
> >       - handles 'http' scheme
> >
> >       - handles 'https' scheme
> >
> >
> >
> >      The following authentication credential caches are available:
> >
> >
> >
> >      * Plaintext cache in /home/me/.subversion
> >
> >      * Gnome Keyring
> >
> >      * GPG-Agent
> >
> > svn 1.10.2 was failing the same way before we upgraded to 1.14.1 as a
> possible fix.
> > Checking out to a local disk succeeds more often, but still hangs and
> crashes.  Checking out to an NFS drive just makes it worse.
>

I don't immediately see the connection between the call stacks above and the
fact that it fails more often when the WC is on an NFS drive. Possibly the
NFS drive is slower and this causes some kind of timeout? Can you create a
ramdisk, put the WC there temporarily, and see if there is a difference?
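
To compare locations with numbers rather than impressions, something like
the following sketch could repeat a command and count how often it succeeds,
hangs or crashes. The names classify_run and tally and the reset callback are
made up for illustration. I'm assuming a POSIX client, where a negative
return code from subprocess means the child was killed by a signal (for
example 11 for SIGSEGV).

```python
import subprocess


def classify_run(cmd, timeout_s):
    """Run cmd once and return 'ok', 'hang', 'crash (signal N)' or 'exit N'."""
    try:
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising on timeout.
        return "hang"
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # POSIX only: the child was terminated by signal -returncode.
        return f"crash (signal {-proc.returncode})"
    return f"exit {proc.returncode}"


def tally(cmd, runs, timeout_s, reset=None):
    """Repeat cmd and count outcomes; reset() (optional) runs before each try."""
    counts = {}
    for _ in range(runs):
        if reset is not None:
            reset()  # e.g. delete the half-finished working copy
        outcome = classify_run(cmd, timeout_s)
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts
```

You could then run, say, tally(["svn", "checkout", URL, WC_DIR], 20, 300,
reset=...) once with WC_DIR on the ramdisk, once on local disk and once on
NFS, with a reset function that removes WC_DIR so that every run is a fresh
checkout. URL, WC_DIR, the run count and the timeout are placeholders to
adapt to your setup.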


> >
> >
> >
> > And here's more about our Apache.
> >
> >
> >
> > Server version: Apache/2.4.37 (Red Hat Enterprise Linux)
> >
> > Server built:   Aug 30 2023 11:01:53
> >
> > Server's Module Magic Number: 20120211:83
> >
> > Server loaded:  APR 1.6.3, APR-UTIL 1.6.1
> >
> > Compiled using: APR 1.6.3, APR-UTIL 1.6.1
> >
> > Architecture:   64-bit
> >
> > Server MPM:     worker
> >
> >   threaded:     yes (fixed thread count)
> >
> >     forked:     yes (variable process count)
> >
> > Server compiled with....
> >
> > -D APR_HAS_SENDFILE
> >
> > -D APR_HAS_MMAP
> >
> > -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
> >
> > -D APR_USE_SYSVSEM_SERIALIZE
> >
> > -D APR_USE_PTHREAD_SERIALIZE
> >
> > -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
> >
> > -D APR_HAS_OTHER_CHILD
> >
> > -D AP_HAVE_RELIABLE_PIPED_LOGS
> >
> > -D DYNAMIC_MODULE_LIMIT=256
> >
> > -D HTTPD_ROOT="/etc/httpd"
> >
> > -D SUEXEC_BIN="/usr/sbin/suexec"
> >
> > -D DEFAULT_PIDLOG="run/httpd.pid"
> >
> > -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
> >
> > -D DEFAULT_ERRORLOG="logs/error_log"
> >
> > -D AP_TYPES_CONFIG_FILE="conf/mime.types"
> >
> > -D SERVER_CONFIG_FILE="conf/httpd.conf"
> >
> > Access to this server from a browser with SVN and ViewVC pages seems to
> work.
> > Authentication is over Kerberos with mod_auth_gssapi.
> > Authorization uses AuthzSVNAccessFile and an access file.
> > SSL is used with SSLCryptoDevice set to builtin, based on this.
> > I've tried all three MPMs with no change, based on another post:
> prefork, worker, and event.
> > We've had Apache running on RedHat 6 with these repos for many years.
> >
> >
> >
> > I'd be glad to provide additional details or run more tests.  Thanks for
> any ideas you have, and for supporting this software.
> >
> >
> >
> > Jim
>