I'd love to share anything I have after I solve my two main issues:
* I still have not found a production-ready way around the segfault issue,
which causes HTTP 502:
[sogo] | 2024-08-13 17:12:05.646 sogod[4:4] PG0x0x639d2261e650
SQL: COMMIT TRANSACTION
[sogo] |
[sogo] | (process:4): Lasso-CRITICAL **: 17:12:05.647:
2024-08-13 17:12:05 (profile.c/:913) Trying to unref a non GObject
pointer file=profile.c:913 pointerbybname=profile->identity
pointer=0x639d2252bd90
[gateway_apache] | 10.255.7.2 - - [13/Aug/2024:17:12:05 +0200] "POST
/SOGo/saml2-signon-post HTTP/1.1" 502 341
The non-production way around this has been:
sed -i 's/lasso_release_gobject(profile->identity);//g' \
lasso-2.8.2/lasso/id-ff/profile.c
sed -i 's/lasso_release_gobject(profile->session);//g' \
lasso-2.8.2/lasso/id-ff/profile.c
* I'm also still in a weird redirect loop:
... successful login leads to ...
|SOGo| request took 0.023940 seconds to execute
10.255.7.2 - - [14/Aug/2024:10:57:01 +0200] "POST
/SOGo/saml2-signon-post HTTP/1.1" 302 -
10.255.7.2 "POST /SOGo/saml2-signon-post HTTP/1.1" 302 0/5890 0.026 - -
384K - 12
But then I get redirected to /SOGo//anton and not /SOGo/so/anton (which I
assume is what it is supposed to be, at least)
|SOGo| starting method 'GET' on uri '/SOGo//anton'
|SOGo| traverse(acquire): SOGo => anton
|SOGo| do traverse name: 'SOGo'
|SOGo| do traverse name: 'anton'
|SOGo| traverse miss: name=anton, acquire: i=1,count=2
|SOGo| miss is last object.
|SOGo| handle miss error: <SoAuthRequiredException: 0x5afd47ae9460>
NAME:SoAuthRequired REASON:authentication required
|SOGo| request took 0.006568 seconds to execute
Which leads to another SSO login
10.255.7.2 "GET /SOGo//anton HTTP/1.1" 302 0/0 0.008 - - 0 - 12
10.255.7.2 - - [14/Aug/2024:10:57:01 +0200] "GET /SOGo//anton HTTP/1.1"
302 -
|SOGo| starting method 'POST' on uri '/SOGo/saml2-signon-post'
|SOGo| traverse(acquire): SOGo => saml2-signon-post
|SOGo| do traverse name: 'SOGo'
|SOGo| do traverse name: 'saml2-signon-post'
|SOGo| set clientObject: <SOGo[0x0x5afd47658a90]: name=SOGo>
2024-08-14 10:57:02.028 sogod[1:1] PG0x0x5afd478e4370 SQL: BEGIN TRANSACTION
2024-08-14 10:57:02.029 sogod[1:1] PG0x0x5afd478e4370 SQL: SELECT
t1.c_creationdate, t1.c_id, t1.c_lastseen, t1.c_value FROM
sogo_sessions_folder t1 WHERE t1.c_id='NL...rC'
2024-08-14 10:57:02.029 sogod[1:1] PG0x0x5afd478e4370 SQL: ROLLBACK
TRANSACTION
2024-08-14 10:57:02.030 sogod[1:1] PG0x0x5afd47983090 SQL: BEGIN TRANSACTION
2024-08-14 10:57:02.030 sogod[1:1] PG0x0x5afd47983090 SQL: INSERT INTO
sogo_sessions_folder (c_lastseen, c_creationdate, c_value, c_id) VALUES
(1723625822, 1723625822, '5S8...hcx', 'NL...rC')
2024-08-14 10:57:02.031 sogod[1:1] PG0x0x5afd47983090 SQL: COMMIT
TRANSACTION
|SOGo| request took 0.022581 seconds to execute
10.255.7.2 "POST /SOGo/saml2-signon-post HTTP/1.1" 302 0/5890 0.025 - -
0 - 12
10.255.7.2 - - [14/Aug/2024:10:57:02 +0200] "POST
/SOGo/saml2-signon-post HTTP/1.1" 302 -
... which leads back to the loop scenario ...
|SOGo| starting method 'GET' on uri '/SOGo//anton'
|SOGo| traverse(acquire): SOGo => anton
|SOGo| do traverse name: 'SOGo'
|SOGo| do traverse name: 'anton'
|SOGo| traverse miss: name=anton, acquire: i=1,count=2
|SOGo| miss is last object.
|SOGo| handle miss error: <SoAuthRequiredException: 0x5afd47a828a0>
NAME:SoAuthRequired REASON:authentication required
And this I can't get around.
//Anton
On 8/12/24 4:46 PM, qhivert ([email protected]) wrote:
Hi,
I would be happy to write better documentation. I will try to set up an
environment with keycloak/saml2 based on your mails.
* For SOGoSAML2IdpPublicKeyLocation and SOGoSAML2IdpCertificateLocation, do you
add the -----BEGIN PUBLIC KEY----- and -----BEGIN CERTIFICATE----- (and the
-----END X-----) lines to the value given by keycloak? My keycloak only gives
the raw value without those markers.
* Could you share the current SAML parameters/values from your sogo.conf?
* Does the mail work? I see you set `NGImap4AuthMechanism = SAML;` and are
using dovecot, but I assume you had to set something else, as dovecot doesn't
natively support this mechanism.
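For what it's worth, here is a minimal sketch of adding those markers to a bare base64 value. It assumes the files are expected in standard PEM form, which nothing in this thread confirms for SOGo or lasso. `wrap_pem` is a hypothetical helper, and the input below is a placeholder, not a real certificate.

```python
def wrap_pem(b64_value: str, label: str) -> str:
    """Wrap a bare base64 string in PEM markers with 64-character lines."""
    # Drop any whitespace or line breaks picked up while copying the value.
    compact = "".join(b64_value.split())
    # Split the base64 body into 64-character lines.
    chunks = [compact[i:i + 64] for i in range(0, len(compact), 64)]
    body = "\n".join(chunks)
    return f"-----BEGIN {label}-----\n{body}\n-----END {label}-----\n"


if __name__ == "__main__":
    # Placeholder base64 text standing in for the value copied from Keycloak.
    placeholder = "QUJD" * 40
    print(wrap_pem(placeholder, "CERTIFICATE"), end="")
    print(wrap_pem(placeholder, "PUBLIC KEY"), end="")
```

The label is passed in rather than hard-coded, because the certificate and the public key need different markers.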
If you think of anything else that is missing from the doc and that you had to
figure out by yourself to make it work, don't hesitate to share.
Quentin
-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Anton Hvornum
Sent: Monday, 12 August 2024 15:28
To: [email protected]
Subject: Re: [SOGo] SOGo with SAML2 against KeyCloak 23+ causes "Tried to add nil
value for key 'login' to dictionary INFO"
Hi!
Thank you for the reply. I guess going with LDAP behind KeyCloak makes more
sense in that case.
I would say this issue remains:
(process:8): Lasso-CRITICAL **: 15:24:02.378: 2024-08-12 15:24:02
(profile.c/:913) Trying to unref a non GObject pointer
file=profile.c:913 pointerbybname=profile->identity pointer=0x60ddab4c7d00
The issue is not super obvious, as it happens after the browser has received
the response, so it probably won't even be noticed in normal day-to-day
operation. But it fills the logs, and it causes exit codes left and right, so
monitoring software can get confused.
//Anton
On 8/12/24 9:29 AM, qhivert ([email protected]) wrote:
Hello,
Yes, with saml2 or cas, sogo still needs a user source, as the SSO is only
used to validate the user and fetch the user's mail; for everything else, like
the cn, it needs an LDAP or SQL user source.
So, is everything working now, or do you still have issues? May I ask how you
configured your imap server?
Quentin
-----Original Message-----
From: [email protected] <[email protected]> On Behalf Of Anton
Hvornum
Sent: Saturday, 10 August 2024 20:38
To: [email protected]
Subject: Re: [SOGo] SOGo with SAML2 against KeyCloak 23+ causes "Tried to add nil
value for key 'login' to dictionary INFO"
On 8/10/24 18:20, Anton Hvornum wrote:
On 8/10/24 12:09, Anton Hvornum wrote:
On 8/10/24 02:27, Anton Hvornum wrote:
On 8/9/24 22:59, Anton Hvornum ([email protected]) wrote:
I've attempted to get SAML login working using the following guide:
https://bluntlab.space/posts/sogo-saml-keycloak/
Currently I'm running apache, memcached, sogo and postgresql in a
docker compose environment while keycloak, postfix and dovecot are
running externally.
This is an excerpt from sogo.conf:
SOGoCacheCleanupInterval = 3600;
SOGoAuthenticationType = saml2;
NGImap4AuthMechanism = SAML;
SOGoSAML2IdpMetadataLocation = "/etc/sogo/idp-metadata.xml";
SOGoSAML2PrivateKeyLocation = "/etc/sogo/saml.privkey.pem";
SOGoSAML2CertificateLocation = "/etc/sogo/saml.cert.pem";
// SOGoSAML2IdpPublicKeyLocation = "/etc/sogo/idp.key";
// SOGoSAML2IdpCertificateLocation = "/etc/sogo/idp.crt";
SOGoSAML2LoginAttribute = "username";
SOGoSAML2LogoutEnabled = YES;
SOGoSAML2LogoutURL = "https://sogo.domain.com";
When visiting https://sogo.domain.com/SOGo I get redirected to the
keycloak realm SSO prompt, credentials are accepted and it
redirects me back to what I configured in KeyCloak to be
"Assertion Consumer Service POST Binding URL":
https://sogo.domain.com:443/SOGo/saml2-signon-post
But once there, I keep hitting:
```
sogod [11]: 192.168.0.10 "GET /SOGo HTTP/1.1" 302 0/0 0.002 - - 0 - 11
sogod [11]: |SOGo| starting method 'POST' on uri '/SOGo/saml2-signon-post'
sogod [11]: |SOGo| traverse(acquire): SOGo => saml2-signon-post
sogod [11]: |SOGo| do traverse name: 'SOGo'
sogod [11]: |SOGo| do traverse name: 'saml2-signon-post'
sogod [11]: |SOGo| set clientObject: <SOGo[0x0x5a13bcaa3e80]: name=SOGo>
sogod[11:11] EXCEPTION: <NSException: 0x5a13bcc7dd10>
NAME:NSInvalidArgumentException REASON:Tried to add nil value for
key 'login' to dictionary INFO:{}
```
Any idea why SOGo (or is it a library like lasso) would generate
"Tried to add nil value for key 'login' to dictionary INFO:{}"?
//Anton
From my limited ability to debug Objective-C, it appears that the
error is caused by:
https://github.com/Alinto/sogo/blob/b602b2b188ce6c331875450c6b1dbe48240f4ff7/UI/MainUI/SOGoSAML2Actions.m#L176
```
newSession = [SOGoSAML2Session SAML2SessionInContext: context];
[newSession processAuthnResponse: [rq formValueForKey:
@"SAMLResponse"]];
login = [newSession login];
```
Where `[newSession login]` is `nil`?
My next obsession is going to be guessing what could be missing in
the SAML response from keycloak. Here's the post-back data from
keycloak: https://0x0.st/XWZs.txt
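One way to check what the response actually carries is to decode the `SAMLResponse` form value and list the attribute names in it. The sketch below assumes the HTTP-POST binding (plain base64, not deflated) and an unencrypted assertion. `saml_attributes` is a hypothetical helper, and the sample XML is a heavily trimmed illustration, not the real Keycloak payload. Whether SOGo matches `SOGoSAML2LoginAttribute` against `Name` or some other field is also an assumption here.

```python
import base64
import xml.etree.ElementTree as ET

# Standard SAML 2.0 assertion namespace.
SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def saml_attributes(saml_response_b64: str) -> dict:
    """Return {attribute Name: [values]} from a base64-encoded SAMLResponse."""
    root = ET.fromstring(base64.b64decode(saml_response_b64))
    found = {}
    for attr in root.iterfind(".//saml:Attribute", SAML_NS):
        values = [v.text or "" for v in attr.iterfind("saml:AttributeValue", SAML_NS)]
        found[attr.get("Name")] = values
    return found


if __name__ == "__main__":
    # Illustrative response only; a real one also carries signatures, conditions, etc.
    sample = (
        '<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
        'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">'
        "<saml:Assertion><saml:AttributeStatement>"
        '<saml:Attribute Name="username">'
        "<saml:AttributeValue>anton</saml:AttributeValue>"
        "</saml:Attribute>"
        "</saml:AttributeStatement></saml:Assertion></samlp:Response>"
    )
    attrs = saml_attributes(base64.b64encode(sample.encode()).decode())
    print(attrs)
    # The attribute named in SOGoSAML2LoginAttribute should show up here.
    print("username present:", "username" in attrs)
```

If the attribute named in `SOGoSAML2LoginAttribute` is absent from the output, that would be consistent with `[newSession login]` coming back as `nil`.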
//Anton
I also keep getting:
```
(process:907): Lasso-CRITICAL **: 11:28:20.078: 2024-08-10 11:28:20
(profile.c/:913) Trying to unref a non GObject pointer
file=profile.c:913 pointerbybname=profile->identity
pointer=0x5f5b99c00e40
(process:907): Lasso-CRITICAL **: 11:28:20.078: 2024-08-10 11:28:20
(profile.c/:916) Trying to unref a non GObject pointer
file=profile.c:916 pointerbybname=profile->session
pointer=0x5f5b99cb6ae0
```
And in a production container I get a coredump:
```
2024-08-10 12:05:10.867 sogod[34:34] PG0x0x5e19c98edae0 SQL: COMMIT
TRANSACTION
(process:34): Lasso-CRITICAL **: 12:05:10.886: 2024-08-10 12:05:10
(profile.c/:913) Trying to unref a non GObject pointer
file=profile.c:913 pointerbybname=profile->identity
pointer=0x5e19ca0f5d90
2024-08-10 12:05:11.077 sogod[28:28] INFO(-[NGActiveSocket isAlive])
poll(): fd=7 revents=0x0011)
Aug 10 12:05:11 sogod [28]: <0x0x5e19c9ccd250[WOWatchDogChild]> child 34 exited
Aug 10 12:05:11 sogod [28]: <0x0x5e19c9ccd250[WOWatchDogChild]>
(terminated due to signal 11, coredump)
```
Which according to some bug reports should already be handled:
- https://bugs.sogo.nu//view.php?id=5153
- https://bugs.sogo.nu/view.php?id=5270
The exception appears to happen only for the two keys in `lassoLogin`
that have been "dumped" to or loaded from a string:
https://github.com/Alinto/sogo/blob/b602b2b188ce6c331875450c6b1dbe48240f4ff7/SoObjects/SOGo/SOGoSAML2Session.m#L354-L362
And while debugging, with my limited knowledge, it looks like I get
a segfault right after the response is being sent.
```
gdb \
-ex 'set breakpoint pending on' \
-ex 'break SOGoSAML2Actions.m:174' \
-ex 'run' \
--args /usr/bin/sogod -WOUseWatchDog NO -SOGoDebugRequests YES \
-WONoDetach YES -WOPort 0.0.0.0:20000 -WOWorkersCount 1 -WOLogFile - \
-WOPidFile /tmp/sogo.pid
```
Generates:
```
Aug 10 11:57:30 sogod [271]: |SOGo| WOHttpAdaptor listening on address 0.0.0.0:20000
Aug 10 11:57:31 sogod [271]: |SOGo| starting method 'POST' on uri '/SOGo/saml2-signon-post'
Aug 10 11:57:31 sogod [271]: <0x0x582dab9c4650[SOGoCache]> Cache cleanup interval set every 3600.000000 seconds
Aug 10 11:57:31 sogod [271]: <0x0x582dab9c4650[SOGoCache]> Using host(s) 'memcached:11211' as server(s)
Aug 10 11:57:31 sogod [271]: |SOGo| traverse(acquire): SOGo => saml2-signon-post
Aug 10 11:57:31 sogod [271]: |SOGo| do traverse name: 'SOGo'
Aug 10 11:57:31 sogod [271]: |SOGo| do traverse name: 'saml2-signon-post'
Aug 10 11:57:31 sogod [271]: |SOGo| set clientObject: <SOGo[0x0x582dab9cb550]: name=SOGo>
Breakpoint 1, -[SOGoSAML2Actions saml2SignOnPOSTAction]
(self=0x582dabbea890, _cmd=0x582dabb12e90) at
/usr/src/debug/sogo/SOGo-5.10.0/UI/MainUI/SOGoSAML2Actions.m:174
174 newSession = [SOGoSAML2Session SAML2SessionInContext:
context];
(gdb) next
175 [newSession processAuthnResponse: [rq formValueForKey:
@"SAMLResponse"]];
176 login = [newSession login];
178 application = [SoApplication application];
179 auth = [application authenticatorInContext: context];
182 inContext: context];
181 andPassword: [newSession
identifier]
182 inContext: context];
2024-08-10 11:57:42.261 sogod[271:271] PostgreSQL72 connection
established: <0x0x582dabcc6000[PGConnection]:
connection=0x0x582dabcb4c30>
2024-08-10 11:57:42.261 sogod[271:271] PostgreSQL72 channel
0x0x582dabcb8840 opened (connection=<0x0x582dabcc6000[PGConnection]:
connection=0x0x582dabcb4c30>, count=2)
2024-08-10 11:57:42.262 sogod[271:271] PG0x0x582dabcb8840 SQL: BEGIN
TRANSACTION
2024-08-10 11:57:42.262 sogod[271:271] PG0x0x582dabcb8840 SQL:
SELECT t1.c_creationdate, t1.c_id, t1.c_lastseen, t1.c_value FROM
sogo_sessions_folder t1 WHERE t1.c_id='Gq...+62v'
2024-08-10 11:57:42.263 sogod[271:271] PG0x0x582dabcb8840 SQL:
ROLLBACK TRANSACTION
2024-08-10 11:57:42.263 sogod[271:271] PG0x0x582dab5d4ae0 SQL: BEGIN
TRANSACTION
2024-08-10 11:57:42.263 sogod[271:271] PG0x0x582dab5d4ae0 SQL:
INSERT INTO sogo_sessions_folder (c_lastseen, c_creationdate,
c_value, c_id) VALUES (1723283862, 1723283862, '91R...EmI',
'Gq...+62v')
2024-08-10 11:57:42.264 sogod[271:271] PG0x0x582dab5d4ae0 SQL:
COMMIT TRANSACTION
185 creds = [auth parseCredentials: [authCookie value]];
187 value: [[SOGoSession
valueForSessionKey: [creds lastObject]] asSHA1String]];
188 [xsrfCookie setPath: [NSString stringWithFormat:
@"/%@/", [[context request] applicationName]]];
189 [response addCookie: xsrfCookie];
191 oldLocation = [[context clientObject] baseURLInContext:
context];
193 oldLocation, [login
stringByEscapingURL]];
195 [response setStatus: 302];
196 [response setHeader: newLocation forKey: @"location"];
197 [response addCookie: authCookie];
205 return response;
206 }
-[SoActionInvocation
callOnObject:withPositionalParametersWhenNotNil:inContext:]
(self=<optimized out>, _cmd=<optimized out>, _client=<optimized
out>, _positionalArgs=0x0, _ctx=0x582dab9d1b90)
at SoObjects/SoActionInvocation.m:310
310 result = [result retain];
311 [method release]; method = nil;
312 return [result autorelease];
0x00007e26c476bb40 in ?? () from /usr/lib/libgnustep-base.so.1.29
Cannot find bounds of current function
Cannot find bounds of current function
Cannot find bounds of current function
(gdb) c
Continuing.
(process:271): Lasso-CRITICAL **: 11:57:50.259: 2024-08-10 11:57:50
(profile.c/:913) Trying to unref a non GObject pointer
file=profile.c:913 pointerbybname=profile->identity
pointer=0x582dabcb4de0
Program received signal SIGSEGV, Segmentation fault.
0x00007e26c3fb1f81 in g_type_check_instance_is_fundamentally_a ()
from /usr/lib/libgobject-2.0.so.0
```
Any guidance or assistance here would be greatly appreciated as I'm
way out in deep water.
My two main concerns are:
- What's missing from KeyCloak or what fields/data are wrong in the
SAML2 response that could cause this
- How do I patch the lassoLogin data to not cause coredump/critical
errors
//Anton
Continuing on to debug this, it appears that:
On a good run (before auth is done), de-allocation in lasso looks like:
```
912 lasso_mem_debug("LassoProfile", "Identity",
profile->identity);
(gdb) print profile->identity
$2 = (LassoIdentity *) 0x0
freeing LassoProfile/Identity (at (nil))
913 lasso_release_gobject(profile->identity);
```
On a bad run, it looks like this:
```
912 lasso_mem_debug("LassoProfile", "Identity",
profile->identity);
(gdb) print profile->identity
$4 = (LassoIdentity *) 0x614290dbbc10
freeing LassoProfile/Identity (at 0x614290dbbc10)
913 lasso_release_gobject(profile->identity);
Program received signal SIGSEGV, Segmentation fault.
```
Either way, it appears to hit after the request is sent to the user
so it shouldn't matter for debugging what's missing between SOGo and
KeyCloak SAML2 response.
But it appears that SOGo treats my SAML2 user as anonymous or
non-existent:
```
if (!user || [[user login] isEqualToString: @"anonymous"])
```
However, when adding authentication using SQL:
https://www.sogo.nu/files/docs/SOGoInstallationGuide.html#Authentication-using-SQL
I got a bit further, but had to manually change the
URL from https://sogo.domain.com//anton to
https://sogo.domain.com/so/anton
as if it didn't really get that it should redirect to `/so/anton`.
Which got me thinking, that something was wrong with an attribute
somewhere.
It looks like changing mapper type from "User Attribute" to "User
Property" for email and username gets me further.
So I tried removing the SQL auth source, which caused the user to not be
able to log in again.
But adding SQL authentication back in causes the `/so/<user>` URL to
return an HTTP status code that triggers a plain HTTP login window:
```
SOGoUserSources =
(
{
type = sql;
id = vmail_mailbox;
viewURL = "postgresql://sogo:sogo@sogo_db:5432/sogo/users";
canAuthenticate = YES;
isAddressBook = YES;
userPasswordAlgorithm = md5;
prependPasswordScheme = YES;
displayName = "Global Address Book";
}
);
```
psql -U sogo sogo -c "CREATE TABLE IF NOT EXISTS users (c_uid
VARCHAR(255), c_name VARCHAR(255), c_password VARCHAR(255), c_cn
VARCHAR(255), mail VARCHAR(255))"
psql -U sogo sogo -c "INSERT INTO users (c_uid, c_name, c_password,
c_cn, mail) VALUES ('anton', 'anton',
'098f6bcd4621d373cade4e832627b4f6', 'anton', '[email protected]')"
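For anyone reproducing this, the `c_password` value in the INSERT above is the unsalted MD5 hex digest of the string `test`, which matches `userPasswordAlgorithm = md5`. A minimal sketch of generating such a value follows. `md5_hex` is a hypothetical helper, and whether SOGo also expects a scheme prefix when `prependPasswordScheme = YES` is not covered here.

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the unsalted MD5 hex digest of a password string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints 098f6bcd4621d373cade4e832627b4f6, the value used in the INSERT.
    print(md5_hex("test"))
```

Unsalted MD5 is fine for a throwaway test table like this one, but it is not a good choice for real passwords.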
Resetting everything to make sure I didn't make more mistakes while
debugging, I now hit: LASSO_DS_ERROR_CA_CERT_CHAIN_LOAD_FAILED
Sorry for the noise but the SAML2 docs are slightly limited and I
have no idea how to properly interpret these debug messages.
//Anton
So the LASSO_DS_ERROR_CA_CERT_CHAIN_LOAD_FAILED appears to happen
because the certificate and public key extracted from KeyCloak do not
contain:
-----BEGIN PRIVATE KEY-----
and not
-----BEGIN PUBLIC KEY-----
And the HTTP authentication appears to be a faulty apache configuration taken
from the example conf.
But SOGo will not function without a SOGoUserSources alongside the SAML2 login
for (to me) unknown reasons.
Apologies for the noise, but debugging the SAML stuff was not easy and I was
hoping someone else had bumped into similar issues.
Thanks to Jeroen for a previous mail thread:
https://www.mail-archive.com/[email protected]/msg29861.html
It really helped out with the mappers as well as where to get the RSA keys from.
//Anton