I was able to reproduce on that system. I haven't had a chance to put a "test" case together yet. I'll try to do that next week. The key thing, I think, is that the firewall silently drops the connection. If the hostname didn't exist, or if the connection were rejected, I think the request would return more quickly and likely not cause problems. Because the connection is dropped, you sit there for the entire timeout, and then through any retry mechanisms and their timeouts as well.
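To put a rough number on that, here is a minimal sketch of the worst-case wait when every attempt is silently dropped. The class name, the fixed backoff, and the example values are all assumptions for illustration; I have not checked what timeout or retry settings CAS actually applies here.

```java
import java.time.Duration;

public final class WorstCaseWait {
    private WorstCaseWait() {}

    /**
     * Upper bound on how long one caller can be stuck when every attempt
     * is silently dropped: each attempt burns the full connect timeout,
     * and each retry adds a fixed backoff pause before it starts.
     * All parameters are hypothetical, not CAS defaults.
     */
    public static Duration worstCase(Duration connectTimeout, int retries, Duration backoff) {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        Duration attempts = connectTimeout.multipliedBy(retries + 1L);
        Duration pauses = backoff.multipliedBy(retries);
        return attempts.plus(pauses);
    }
}
```

With a 5 second connect timeout, 3 retries, and a 1 second backoff, that works out to 23 seconds per request, whereas a rejected connection would fail almost immediately.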

On 3/24/23 09:55, Misagh Moayyed wrote:
Interesting. Is this something you can reproduce?

On Tuesday, March 14, 2023 at 9:49:38 PM UTC+4 richard.frovarp wrote:

    I'm on CAS 6.6.6. I had a SAML 2 service that was trying to pull
    metadata from a remote URL. This request timed out (discovered a
    firewall in the way). That ended up causing all of my other SAML 2
    services to time out as well. The CAS protocol services were just
    fine.
    In my logs I see:

    2023-03-14 11:02:05,949 ERROR [org.apereo.cas.util.HttpUtils] - <Connect to hostname failed: connect timed out
            DefaultHttpClientConnectionOperator.java:connect:151
            PoolingHttpClientConnectionManager.java:connect:376
            MainClientExec.java:establishRoute:393
    >
    2023-03-14 11:02:05,949 ERROR [org.apereo.cas.support.saml.services.idp.metadata.cache.resolver.UrlResourceMetadataResolver] - <NullPointerException
            UrlResourceMetadataResolver.java:resolve:107
            SamlRegisteredServiceMetadataResolverCacheLoader.java:lambda$load$1:66
            Unchecked.java:lambda$function$21:878


    Since it timed out, there is no status line to get a status code for.
    That caused the NPE. I see this error a few times, so I don't know if
    CAS was doing a retry, or my browser was trying it.
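
As a minimal sketch of the kind of guard that would avoid this NPE: treat a missing response, or a response with no status, as a failed fetch instead of dereferencing it. The `Response` class below is a hypothetical stand-in, not the CAS or Apache HttpClient type, and this is not the actual code in `UrlResourceMetadataResolver`.

```java
import java.util.Optional;

public final class StatusGuard {
    private StatusGuard() {}

    /**
     * Hypothetical stand-in for an HTTP response. A null statusCode
     * models a request that timed out before any status line arrived.
     */
    public static final class Response {
        private final Integer statusCode;

        public Response(Integer statusCode) {
            this.statusCode = statusCode;
        }

        public Integer getStatusCode() {
            return statusCode;
        }
    }

    /**
     * Returns true only for a 2xx status. A null response or a null
     * status code is reported as a failure rather than throwing.
     */
    public static boolean isSuccessful(Response response) {
        return Optional.ofNullable(response)
                .map(Response::getStatusCode)   // empty if the status is null
                .filter(code -> code >= 200 && code < 300)
                .isPresent();
    }
}
```

With a check like this, a timed-out fetch would be logged as an ordinary failure for that one service instead of surfacing as a NullPointerException.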

    I also see a different set of errors for the same service:

    2023-03-14 10:44:23,080 ERROR [org.apereo.cas.support.saml.services.idp.metadata.SamlRegisteredServiceServiceProviderMetadataFacade] - <No metadata resolvers could be configured for service Grouper Devel with metadata location path
            SamlRegisteredServiceMetadataResolverCacheLoader.java:load:72
            SamlRegisteredServiceMetadataResolverCacheLoader.java:load:31
            LocalLoadingCache.java:lambda$newMappingFunction$3:197
    >
    2023-03-14 10:44:23,080 WARN [org.apereo.cas.support.saml.web.idp.profile.AbstractSamlIdPProfileHandlerController] - <No metadata could be found for [entityId]>
    2023-03-14 10:44:23,080 WARN [org.apereo.cas.util.function.FunctionUtils] - <Cannot find metadata linked to entityId
            AbstractSamlIdPProfileHandlerController.java:verifySamlAuthenticationRequest:497
            AbstractSamlIdPProfileHandlerController.java:initiateAuthenticationRequest:315
            AbstractSamlIdPProfileHandlerController.java:lambda$handleSsoPostProfileRequest$4:652
    >
    2023-03-14 10:44:23,080 ERROR [org.apereo.cas.web.support.WebUtils] - <Cannot find metadata linked to entityId
            AbstractSamlIdPProfileHandlerController.java:verifySamlAuthenticationRequest:497
            AbstractSamlIdPProfileHandlerController.java:initiateAuthenticationRequest:315
            AbstractSamlIdPProfileHandlerController.java:lambda$handleSsoPostProfileRequest$4:652


    While this was happening, my other SAML 2 services also timed out.
    I'm guessing it has to do with the resolver being synchronized? A
    timeout takes a while to happen, so that would hold up the other
    services that were otherwise fine. The fix was to restart CAS. I
    don't know whether the other services were failing because this one
    kept retrying and timing out, or because the NPE broke things badly
    enough. This happened on prod, so we hit the restart pretty quickly.
    It was on a service that I used and I caused it, so we detected it
    pretty quickly.
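
To illustrate the isolation I would hope for, here is a rough sketch in which each service's metadata is fetched on its own task and callers wait only a bounded time, so one hung fetch cannot hold up lookups for other services. This is not how CAS is implemented (the synchronization above is only my guess); the class, the `fetcher` function, and the wait value are hypothetical, and only JDK classes are used.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

public final class IsolatedMetadataLoader {
    // One future per service id; completed futures double as a simple cache.
    private final Map<String, CompletableFuture<String>> loads = new ConcurrentHashMap<>();
    // Daemon threads so a hung fetch never keeps the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "metadata-fetch");
        t.setDaemon(true);
        return t;
    });
    private final Function<String, String> fetcher;
    private final long waitMillis;

    public IsolatedMetadataLoader(Function<String, String> fetcher, long waitMillis) {
        this.fetcher = fetcher;
        this.waitMillis = waitMillis;
    }

    /** Waits at most waitMillis for this service only; other ids are unaffected. */
    public Optional<String> load(String serviceId) {
        CompletableFuture<String> future = loads.computeIfAbsent(serviceId,
                id -> CompletableFuture.supplyAsync(() -> fetcher.apply(id), pool));
        try {
            return Optional.ofNullable(future.get(waitMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            return Optional.empty();          // fetch keeps running in the background
        } catch (ExecutionException e) {
            loads.remove(serviceId, future);  // allow a fresh attempt next time
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }
}
```

The sketch never expires successful entries, so a real version would still need a refresh policy. The point is only that the slow service's caller gives up after a bounded wait while the other services resolve normally.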

    It can now pull the metadata, so things are fine. However, I'm not
    super thrilled with the idea of one service's metadata refresh timing
    out and killing the rest of my SAML 2 services. I could come up with
    my own caching method, storing the metadata in git external to CAS,
    but as of right now I'd prefer that CAS did it.
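
For what it's worth, the caching I have in mind is roughly "keep the last good copy and serve it when a refresh fails". A minimal sketch, assuming metadata is handled as a plain string and using hypothetical names throughout (this is not an existing CAS feature as far as I know):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public final class LastKnownGoodCache {
    // Last successfully fetched metadata per service id.
    private final Map<String, String> lastGood = new ConcurrentHashMap<>();

    /**
     * Tries to refresh metadata for one service. If the fetch throws or
     * returns null, the previously stored copy (if any) is returned.
     */
    public Optional<String> refresh(String serviceId, Supplier<String> fetch) {
        try {
            String fresh = fetch.get();
            if (fresh != null) {
                lastGood.put(serviceId, fresh);
            }
        } catch (RuntimeException e) {
            // Refresh failed (for example, a timeout); keep the old copy.
        }
        return Optional.ofNullable(lastGood.get(serviceId));
    }
}
```

A failed refresh then only matters for a service that has never been fetched successfully; everything else keeps working from its stored copy until the remote URL is reachable again.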

    Richard

--
You received this message because you are subscribed to the Google Groups "CAS Developer" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/a/apereo.org/d/msgid/cas-dev/fe092d69-51f1-46ce-a04f-b192e2db214en%40apereo.org <https://groups.google.com/a/apereo.org/d/msgid/cas-dev/fe092d69-51f1-46ce-a04f-b192e2db214en%40apereo.org?utm_medium=email&utm_source=footer>.

