Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-18 Thread Дилян Палаузов
Hello,

On Fri, 2018-07-13 at 10:40 +0200, Michael Menge wrote:
> 
> I have been running mupdate with STARTTLS for a few years with 2.3.x  
> and 2.4.x.
> It didn't crash as predictable as 3.0, but we had some instances where ...

The likely reason, why it hasn't crashеа as predictable is, that the
line DH_free(dh_params); was not part of cyrus-imapd-2.4  . It was
added to the caldav-2.4 branch back in 2014 and from there stayed in
the 3.0 and master branches.  Once a memory leak was found in 2017 on
the cyrus-imapd-2.5 branch, the line was added also there.  So for
fixing this on 2.5 it was a matter of having two persons find the
problem: one finds it and fixes it on caldav-2.4, and somebody else
finds it again on 2.5 two years later and arranges a fix there.

I thought everything from caldav-2.4 went into 2.5.  Now I see this was
not the case.

Please, as a rule, when fixing stuff on master (or caldav-2.4 in this
case), consider whether it is relevant also for the stable versions and
fix it there at the same time.  Postgres or Evolution are good
examples, how this process can be done.

Regards
  Дилян



Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-15 Thread Sebastian Hagedorn

out of curiosity, why is mupdate multi-threaded in the first place?


I asked around last week but it seems like that answer has been lost to
history.  It was added a long time ago, so it might have been a "threads
are cheaper than processes" decision.  The initial commit for
imap/mupdate.c (2001) describes it as "added mupdate for possible
lightweight acap replacement", but I don't know what "acap" was!


I know that, because I still use one of the very few clients that support 
ACAP, and I used to run my own ACAP server so I could use it:



--
Sebastian Hagedorn - Weyertal 121, Zimmer 2.02
Regionales Rechenzentrum (RRZK)
Universität zu Köln / Cologne University - Tel. +49-221-470-89578


Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-15 Thread ellie timoney
Hi Michael,

On Fri, Jul 13, 2018, at 6:40 PM, Michael Menge wrote:
> Hi Ellie
> 
> thanks for your replies,
> 
> Quoting ellie timoney :
> 
> > From what I'm seeing here, it looks like mupdate calls  
> > tls_init_serverengine() for each new STARTTLS session, and then  
> > calls tls_shutdown_serverengine()  when that session ends.
> >
> > The thing is though, the TLS state that these functions manage is  
> > something like a singleton, it should only exist once for each  
> > process.  And in fact, tls_init_serverengine() enforces this (which  
> > is why you don't see the dh_params being allocated for each  
> > connection: it only lets itself run once).  This works fine for all  
> > the single-threaded service programs, but mupdate is a threaded  
> > application and can have multiple connected client sessions in the  
> > same process.  The comments at the top of imap/tls.c are instructive:
> >
> >> * DESCRIPTION
> >> *   This module is the interface between Cyrus Imapd and the  
> >> OpenSSL library.
> >> *   As of now only one filedescriptor can be handled, so only one
> >> *   TLS channel can be open at a time.
> >
> > I can't see anything in the code that makes me suspect this comment  
> > is out of date.  It doesn't look the slightest bit thread-safe.
> >
> > At the moment, it looks to me like the STARTTLS support was added to  
> > mupdate with no consideration of its thread-safety, and I'm left to  
> > assume that people aren't really using STARTTLS for their mupdate  
> > connections, otherwise this probably would've been tripped over long  
> > ago.  (Or, maybe it was, and someone said "don't do that", and then  
> > it all got lost to history?  It would be interesting to hear from  
> > the front lines.)
> 
> I have been running mupdate with STARTTLS for a few years with 2.3.x  
> and 2.4.x.
> It didn't crash as predictable as 3.0, but we had some instances where the
> mupdate master crashed and stopped working and we had to restart the service
> (on an other host). At that time we where unable to debug it, as it would
> run without crash for weeks or months. I suspect now that the  
> non-thread-safe usage
> of openssl could have been the cause for this. 

That's interesting to hear.  Have your servers become busier over that time?  I 
expect that in a low traffic scenario mupdate might end up being mostly 
single-threaded in practice, allowing the issue to go unobserved, but as 
traffic increases, the number of conflicts would increase.

> Using STARTTLS for mupdate was the obvious solution after the default  
> for "allowplaintext"
> (Allow the use of cleartext passwords on the wire) was changed to 0.

You _might_ be able to do something with setting "mupdate_allowplaintext: yes" 
to allow plaintext connections only to the mupdate service without affecting 
other services, if your only need for STARTTLS in mupdate was to get around 
that default.  I can't remember how the service-specific config prefixing 
mechanism works (and can never seem to find it when I look for it in 
code/docs), so this might or might not actually work...

> >
> > Anyway, it looks to me like the STARTTLS support in mupdate is just  
> > fundamentally broken at the moment, and my recommendation is to not  
> > use it.  If your IMAP servers need to connect to an mupdate server  
> > that's not within their trusted network, I guess you'd need to set  
> > up a VPN for it or something along those lines (but I'm no network  
> > specialist).
> >
> 
> could you add a warning in the relevant murder/installation guides and  
> manuals?

Yep, that's an excellent idea. :)

> 
> > I don't honestly see this being fixed any time soon -- it would  
> > require either:
> >
> >  * a big rewrite of imap/tls.c to make it thread-safe
> >  * a big rewrite of mupdate to make it single-threaded like the rest of 
> > Cyrus
> 
> out of curiosity, why is mupdate multi-threaded in the first place?

I asked around last week but it seems like that answer has been lost to 
history.  It was added a long time ago, so it might have been a "threads are 
cheaper than processes" decision.  The initial commit for imap/mupdate.c (2001) 
describes it as "added mupdate for possible lightweight acap replacement", but 
I don't know what "acap" was!

> 
> >  * a big rewrite of mupdate to make it do its own TLS handling  
> > (rather than using imap/tls.c)
> >
> 
> This sounds like it is the same work as the first item, but it is
> only used for mupdate.
> 
> > All of these are serious tasks with serious testing requirements,  
> > especially considering the need to interact with OpenSSL correctly.   
> > Even if someone does produce patches for master, they won't make it  
> > back to the 3.0 series.
> >
> 
> so is there a chance for a patch in 3.1?

Bron has his infamous "what it would take for FastMail to use murder" list, but 
the main development on 3.1 at the moment is JMAP and Calendaring and stuff 
relating to those, and I guess 

Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-13 Thread Michael Menge

Hi Ellie

thanks for your replies,

Quoting ellie timoney :

From what I'm seeing here, it looks like mupdate calls  
tls_init_serverengine() for each new STARTTLS session, and then  
calls tls_shutdown_serverengine()  when that session ends.


The thing is though, the TLS state that these functions manage is  
something like a singleton, it should only exist once for each  
process.  And in fact, tls_init_serverengine() enforces this (which  
is why you don't see the dh_params being allocated for each  
connection: it only lets itself run once).  This works fine for all  
the single-threaded service programs, but mupdate is a threaded  
application and can have multiple connected client sessions in the  
same process.  The comments at the top of imap/tls.c are instructive:



* DESCRIPTION
*   This module is the interface between Cyrus Imapd and the  
OpenSSL library.

*   As of now only one filedescriptor can be handled, so only one
*   TLS channel can be open at a time.


I can't see anything in the code that makes me suspect this comment  
is out of date.  It doesn't look the slightest bit thread-safe.


At the moment, it looks to me like the STARTTLS support was added to  
mupdate with no consideration of its thread-safety, and I'm left to  
assume that people aren't really using STARTTLS for their mupdate  
connections, otherwise this probably would've been tripped over long  
ago.  (Or, maybe it was, and someone said "don't do that", and then  
it all got lost to history?  It would be interesting to hear from  
the front lines.)


I have been running mupdate with STARTTLS for a few years with 2.3.x  
and 2.4.x.

It didn't crash as predictable as 3.0, but we had some instances where the
mupdate master crashed and stopped working and we had to restart the service
(on an other host). At that time we where unable to debug it, as it would
run without crash for weeks or months. I suspect now that the  
non-thread-safe usage

of openssl could have been the cause for this.

Using STARTTLS for mupdate was the obvious solution after the default  
for "allowplaintext"

(Allow the use of cleartext passwords on the wire) was changed to 0.



Anyway, it looks to me like the STARTTLS support in mupdate is just  
fundamentally broken at the moment, and my recommendation is to not  
use it.  If your IMAP servers need to connect to an mupdate server  
that's not within their trusted network, I guess you'd need to set  
up a VPN for it or something along those lines (but I'm no network  
specialist).




could you add a warning in the relevant murder/installation guides and  
manuals?


I don't honestly see this being fixed any time soon -- it would  
require either:


 * a big rewrite of imap/tls.c to make it thread-safe
 * a big rewrite of mupdate to make it single-threaded like the rest of Cyrus


out of curiosity, why is mupdate multi-threaded in the first place?

 * a big rewrite of mupdate to make it do its own TLS handling  
(rather than using imap/tls.c)




This sounds like it is the same work as the first item, but it is
only used for mupdate.

All of these are serious tasks with serious testing requirements,  
especially considering the need to interact with OpenSSL correctly.   
Even if someone does produce patches for master, they won't make it  
back to the 3.0 series.




so is there a chance for a patch in 3.1?


Sorry to be the bearer of annoying news!


I will see how I can work around this.

The cyrus project has much improved since fastmail dedicated
some developers to it, but I think it is time that fastmail switches
to a murder setup, so that this part of the code gets the same love
and testing as the other parts of cyrus ;-)

Thanks to all helping debugging this problem. I will gladly help
with testing any fixes for this bug.

Kind regards,

   Michael


M.MengeTel.: (49) 7071/29-70316
Universität Tübingen   Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung  mail:  
michael.me...@zdv.uni-tuebingen.de

Wächterstraße 76
72074 Tübingen



Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-12 Thread ellie timoney
>From what I'm seeing here, it looks like mupdate calls tls_init_serverengine() 
>for each new STARTTLS session, and then calls tls_shutdown_serverengine()  
>when that session ends.

The thing is though, the TLS state that these functions manage is something 
like a singleton, it should only exist once for each process.  And in fact, 
tls_init_serverengine() enforces this (which is why you don't see the dh_params 
being allocated for each connection: it only lets itself run once).  This works 
fine for all the single-threaded service programs, but mupdate is a threaded 
application and can have multiple connected client sessions in the same 
process.  The comments at the top of imap/tls.c are instructive:

> * DESCRIPTION
> *   This module is the interface between Cyrus Imapd and the OpenSSL 
> library.
> *   As of now only one filedescriptor can be handled, so only one
> *   TLS channel can be open at a time.

I can't see anything in the code that makes me suspect this comment is out of 
date.  It doesn't look the slightest bit thread-safe.

At the moment, it looks to me like the STARTTLS support was added to mupdate 
with no consideration of its thread-safety, and I'm left to assume that people 
aren't really using STARTTLS for their mupdate connections, otherwise this 
probably would've been tripped over long ago.  (Or, maybe it was, and someone 
said "don't do that", and then it all got lost to history?  It would be 
interesting to hear from the front lines.)

Anyway, it looks to me like the STARTTLS support in mupdate is just 
fundamentally broken at the moment, and my recommendation is to not use it.  If 
your IMAP servers need to connect to an mupdate server that's not within their 
trusted network, I guess you'd need to set up a VPN for it or something along 
those lines (but I'm no network specialist).

I don't honestly see this being fixed any time soon -- it would require either:

 * a big rewrite of imap/tls.c to make it thread-safe
 * a big rewrite of mupdate to make it single-threaded like the rest of Cyrus
 * a big rewrite of mupdate to make it do its own TLS handling (rather than 
using imap/tls.c)

All of these are serious tasks with serious testing requirements, especially 
considering the need to interact with OpenSSL correctly.  Even if someone does 
produce patches for master, they won't make it back to the 3.0 series.

Sorry to be the bearer of annoying news!

ellie

On Fri, Jul 13, 2018, at 12:57 PM, ellie timoney wrote:
> I'm still digging, but if you amend your log patch to also NULL out 
> dh_params after it frees it, that would at least prevent it trying to 
> free it over and over:
> 
>   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
> -if (dh_params) DH_free(dh_params);
> +if (dh_params) {
> +   syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
> +   DH_free(dh_params);
> +   dh_params = NULL;
> +   syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
> +   }
>   #endif
> 
> On Fri, Jul 13, 2018, at 12:47 PM, ellie timoney wrote:
> > > so it seems to me that the dh_params were set once on startup but  
> > > freed for each closed connection
> > 
> > Yikes :o
> > 
> > On Thu, Jul 12, 2018, at 7:09 PM, Michael Menge wrote:
> > > Hi,
> > > 
> > > Дилян had suggested to add some debug outputs to imap/tls.c
> > > 
> > > 
> > > 
> > > diff --git a/imap/tls.c b/imap/tls.c
> > > --- a/imap/tls.c
> > > +++ b/imap/tls.c
> > > @@ -893,7 +893,9 @@ EXPORTED int tls_init_serverengine(const char 
> > > *ident,
> > > 
> > >   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
> > >   /* Load DH params for DHE-* key exchanges */
> > > +syslog(LOG_CRIT, "dh_params will be set, current value=%p", 
> > > dh_params);
> > >   dh_params = load_dh_param(server_key_file, server_cert_file);
> > > +syslog(LOG_CRIT, "dh_params were set current_value=%p", dh_params);
> > >   SSL_CTX_set_tmp_dh(s_ctx, dh_params);
> > >   #endif
> > > 
> > > @@ -1308,7 +1310,11 @@ EXPORTED int tls_shutdown_serverengine(void)
> > >   }
> > > 
> > >   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
> > > -if (dh_params) DH_free(dh_params);
> > > +if (dh_params) {
> > > +   syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
> > > +   DH_free(dh_params);
> > > +   syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
> > > +   }
> > >   #endif
> > >   }
> > > -
> > > 
> > > I did run mupdate with this debug output
> > > 
> > > 
> > > Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params will be set, current  
> > > value=(nil)
> > > Jul 12 10:17:25 mx02 mu/mupdate[6537]: inittls: Loading DH parameters  
> > > from file
> > > Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params were set  
> > > current_value=0x7fc7541b9600
> > > Jul 12 10:17:25 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> > > ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits 

Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-12 Thread ellie timoney
I'm still digging, but if you amend your log patch to also NULL out dh_params 
after it frees it, that would at least prevent it trying to free it over and 
over:

  #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
-if (dh_params) DH_free(dh_params);
+if (dh_params) {
+   syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
+   DH_free(dh_params);
+   dh_params = NULL;
+   syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
+   }
  #endif

On Fri, Jul 13, 2018, at 12:47 PM, ellie timoney wrote:
> > so it seems to me that the dh_params were set once on startup but  
> > freed for each closed connection
> 
> Yikes :o
> 
> On Thu, Jul 12, 2018, at 7:09 PM, Michael Menge wrote:
> > Hi,
> > 
> > Дилян had suggested to add some debug outputs to imap/tls.c
> > 
> > 
> > 
> > diff --git a/imap/tls.c b/imap/tls.c
> > --- a/imap/tls.c
> > +++ b/imap/tls.c
> > @@ -893,7 +893,9 @@ EXPORTED int tls_init_serverengine(const char 
> > *ident,
> > 
> >   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
> >   /* Load DH params for DHE-* key exchanges */
> > +syslog(LOG_CRIT, "dh_params will be set, current value=%p", dh_params);
> >   dh_params = load_dh_param(server_key_file, server_cert_file);
> > +syslog(LOG_CRIT, "dh_params were set current_value=%p", dh_params);
> >   SSL_CTX_set_tmp_dh(s_ctx, dh_params);
> >   #endif
> > 
> > @@ -1308,7 +1310,11 @@ EXPORTED int tls_shutdown_serverengine(void)
> >   }
> > 
> >   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
> > -if (dh_params) DH_free(dh_params);
> > +if (dh_params) {
> > +   syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
> > +   DH_free(dh_params);
> > +   syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
> > +   }
> >   #endif
> >   }
> > -
> > 
> > I did run mupdate with this debug output
> > 
> > 
> > Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params will be set, current  
> > value=(nil)
> > Jul 12 10:17:25 mx02 mu/mupdate[6537]: inittls: Loading DH parameters  
> > from file
> > Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params were set  
> > current_value=0x7fc7541b9600
> > Jul 12 10:17:25 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> > ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
> > Jul 12 10:17:25 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
> > [10.23.21.78] cyrus PLAIN+TLS User logged in
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> > ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
> > [10.23.21.78] cyrus PLAIN+TLS User logged in
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 
> > 0x7fc7541b9600
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params were freed 0x7fc7541b9600
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> > ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
> > [10.23.21.78] cyrus PLAIN+TLS User logged in
> > Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 
> > 0x7fc7541b9600
> > Jul 12 10:18:38 mx02 mu/master[6534]: process type:SERVICE  
> > name:mupdate path:/usr/local/libexec/mupdate age:89.131s pid:6537  
> > signaled to death by signal 11 (Segmentation fault, core dumped)
> > Jul 12 10:18:38 mx02 mu/master[6534]: service mupdate/ipv4 pid 6537 in  
> > READY state: terminated abnormally
> > 
> > so it seems to me that the dh_params were set once on startup but  
> > freed for each closed connection
> > 
> > Regards
> > 
> >  Michael Menge
> > 
> > 
> > Quoting Michael Menge :
> > 
> > > Hi Дилян,
> > >
> > >
> > > Quoting Дилян Палаузов :
> > >
> > >> Hello Michael,
> > >>
> > >> this is likely either a memory mishandling issue (use after free(),
> > >> double free(), invalid read()/write()...), which gets evident if cyrus
> > >> is run under valgrind --tool=memcheck.  I run it with
> > >>
> > >> valgrind --num-callers=30 --leak-check=full --track-origins=yes --read-
> > >> var-info=yes --show-leak-kinds=all --trace-children=yes --track-fds=yes
> > >> /usr/local/cyrus/bin/master -D &> log-file-memcheck
> > >>
> > >
> > > thanks for the valgind options. Valgrind did find some "Invalid read  
> > > of size 4"
> > > and "size 8" in DH_free as well as "Invalid write of size 4" in  
> > > CRYPTO_add_lock
> > > and "size 8" in OPENSSL_cleanse. As far as i can tell the memory was  
> > > always free'd
> > > before by CRYPTO_free (mem.c:434).
> > >
> > > I have attached the full log
> > >
> > >>
> > >> Another reason can be multi-threaded inconsistencies: mutexes locked in
> > >> inconsistent order by differnt threads (while this is not a cause for
> > >> crash, it leads to deadlock), mutexes locked in one thread and unlocked
> > >> in another or 

Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-12 Thread ellie timoney
> so it seems to me that the dh_params were set once on startup but  
> freed for each closed connection

Yikes :o

On Thu, Jul 12, 2018, at 7:09 PM, Michael Menge wrote:
> Hi,
> 
> Дилян had suggested to add some debug outputs to imap/tls.c
> 
> 
> 
> diff --git a/imap/tls.c b/imap/tls.c
> --- a/imap/tls.c
> +++ b/imap/tls.c
> @@ -893,7 +893,9 @@ EXPORTED int tls_init_serverengine(const char *ident,
> 
>   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
>   /* Load DH params for DHE-* key exchanges */
> +syslog(LOG_CRIT, "dh_params will be set, current value=%p", dh_params);
>   dh_params = load_dh_param(server_key_file, server_cert_file);
> +syslog(LOG_CRIT, "dh_params were set current_value=%p", dh_params);
>   SSL_CTX_set_tmp_dh(s_ctx, dh_params);
>   #endif
> 
> @@ -1308,7 +1310,11 @@ EXPORTED int tls_shutdown_serverengine(void)
>   }
> 
>   #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
> -if (dh_params) DH_free(dh_params);
> +if (dh_params) {
> +   syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
> +   DH_free(dh_params);
> +   syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
> +   }
>   #endif
>   }
> -
> 
> I did run mupdate with this debug output
> 
> 
> Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params will be set, current  
> value=(nil)
> Jul 12 10:17:25 mx02 mu/mupdate[6537]: inittls: Loading DH parameters  
> from file
> Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params were set  
> current_value=0x7fc7541b9600
> Jul 12 10:17:25 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
> Jul 12 10:17:25 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
> [10.23.21.78] cyrus PLAIN+TLS User logged in
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
> [10.23.21.78] cyrus PLAIN+TLS User logged in
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 0x7fc7541b9600
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params were freed 0x7fc7541b9600
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
> ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
> [10.23.21.78] cyrus PLAIN+TLS User logged in
> Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 0x7fc7541b9600
> Jul 12 10:18:38 mx02 mu/master[6534]: process type:SERVICE  
> name:mupdate path:/usr/local/libexec/mupdate age:89.131s pid:6537  
> signaled to death by signal 11 (Segmentation fault, core dumped)
> Jul 12 10:18:38 mx02 mu/master[6534]: service mupdate/ipv4 pid 6537 in  
> READY state: terminated abnormally
> 
> so it seems to me that the dh_params were set once on startup but  
> freed for each closed connection
> 
> Regards
> 
>  Michael Menge
> 
> 
> Quoting Michael Menge :
> 
> > Hi Дилян,
> >
> >
> > Quoting Дилян Палаузов :
> >
> >> Hello Michael,
> >>
> >> this is likely either a memory mishandling issue (use after free(),
> >> double free(), invalid read()/write()...), which gets evident if cyrus
> >> is run under valgrind --tool=memcheck.  I run it with
> >>
> >> valgrind --num-callers=30 --leak-check=full --track-origins=yes --read-
> >> var-info=yes --show-leak-kinds=all --trace-children=yes --track-fds=yes
> >> /usr/local/cyrus/bin/master -D &> log-file-memcheck
> >>
> >
> > thanks for the valgind options. Valgrind did find some "Invalid read  
> > of size 4"
> > and "size 8" in DH_free as well as "Invalid write of size 4" in  
> > CRYPTO_add_lock
> > and "size 8" in OPENSSL_cleanse. As far as i can tell the memory was  
> > always free'd
> > before by CRYPTO_free (mem.c:434).
> >
> > I have attached the full log
> >
> >>
> >> Another reason can be multi-threaded inconsistencies: mutexes locked in
> >> inconsistent order by differnt threads (while this is not a cause for
> >> crash, it leads to deadlock), mutexes locked in one thread and unlocked
> >> in another or alike.  This can be detected by valgrind/helgrind
> >>
> >> valgrind --tool=helgrind --num-callers=30 --leak-check=full --track-
> >> origins=yes --read-var-info=yes --trace-children=yes --track-fds=yes
> >> /usr/local/cyrus/bin/master -D &> log-file-helgrind
> >>
> >
> > I had to remove some options here, as my valgrind didn't know them with
> > --tool=helgrind
> >
> > I used /usr/bin/valgrind --tool=helgrind --num-callers=30  
> > --read-var-info=yes --trace-children=yes --track-fds=yes  
> > /usr/local/libexec/master -C /etc/imapd_mu.conf -M  
> > /etc/cyrus_mu.conf -p /var/run/cyurs_mu.pid -D &>  
> > /tmp/cyrus-mupdate-log-file-helgrind
> >
> > Valgrind did find some "Possible data race during read" and "This  
> > conflicts with a previous write"
> > I have attached the full log as well
> >
> >

Re: use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

2018-07-12 Thread Michael Menge

Hi,

Дилян had suggested to add some debug outputs to imap/tls.c



diff --git a/imap/tls.c b/imap/tls.c
--- a/imap/tls.c
+++ b/imap/tls.c
@@ -893,7 +893,9 @@ EXPORTED int tls_init_serverengine(const char *ident,

 #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
 /* Load DH params for DHE-* key exchanges */
+syslog(LOG_CRIT, "dh_params will be set, current value=%p", dh_params);
 dh_params = load_dh_param(server_key_file, server_cert_file);
+syslog(LOG_CRIT, "dh_params were set current_value=%p", dh_params);
 SSL_CTX_set_tmp_dh(s_ctx, dh_params);
 #endif

@@ -1308,7 +1310,11 @@ EXPORTED int tls_shutdown_serverengine(void)
 }

 #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
-if (dh_params) DH_free(dh_params);
+if (dh_params) {
+   syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
+   DH_free(dh_params);
+   syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
+   }
 #endif
 }
-

I did run mupdate with this debug output


Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params will be set, current  
value=(nil)
Jul 12 10:17:25 mx02 mu/mupdate[6537]: inittls: Loading DH parameters  
from file
Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params were set  
current_value=0x7fc7541b9600
Jul 12 10:17:25 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
Jul 12 10:17:25 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
[10.23.21.78] cyrus PLAIN+TLS User logged in
Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
[10.23.21.78] cyrus PLAIN+TLS User logged in

Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 0x7fc7541b9600
Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params were freed 0x7fc7541b9600
Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
[10.23.21.78] cyrus PLAIN+TLS User logged in

Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 0x7fc7541b9600
Jul 12 10:18:38 mx02 mu/master[6534]: process type:SERVICE  
name:mupdate path:/usr/local/libexec/mupdate age:89.131s pid:6537  
signaled to death by signal 11 (Segmentation fault, core dumped)
Jul 12 10:18:38 mx02 mu/master[6534]: service mupdate/ipv4 pid 6537 in  
READY state: terminated abnormally


so it seems to me that the dh_params were set once on startup but  
freed for each closed connection


Regards

Michael Menge


Quoting Michael Menge :


Hi Дилян,


Quoting Дилян Палаузов :


Hello Michael,

this is likely either a memory mishandling issue (use after free(),
double free(), invalid read()/write()...), which gets evident if cyrus
is run under valgrind --tool=memcheck.  I run it with

valgrind --num-callers=30 --leak-check=full --track-origins=yes --read-
var-info=yes --show-leak-kinds=all --trace-children=yes --track-fds=yes
/usr/local/cyrus/bin/master -D &> log-file-memcheck



thanks for the valgind options. Valgrind did find some "Invalid read  
of size 4"
and "size 8" in DH_free as well as "Invalid write of size 4" in  
CRYPTO_add_lock
and "size 8" in OPENSSL_cleanse. As far as i can tell the memory was  
always free'd

before by CRYPTO_free (mem.c:434).

I have attached the full log



Another reason can be multi-threaded inconsistencies: mutexes locked in
inconsistent order by differnt threads (while this is not a cause for
crash, it leads to deadlock), mutexes locked in one thread and unlocked
in another or alike.  This can be detected by valgrind/helgrind

valgrind --tool=helgrind --num-callers=30 --leak-check=full --track-
origins=yes --read-var-info=yes --trace-children=yes --track-fds=yes
/usr/local/cyrus/bin/master -D &> log-file-helgrind



I had to remove some options here, as my valgrind didn't know them with
--tool=helgrind

I used /usr/bin/valgrind --tool=helgrind --num-callers=30  
--read-var-info=yes --trace-children=yes --track-fds=yes  
/usr/local/libexec/master -C /etc/imapd_mu.conf -M  
/etc/cyrus_mu.conf -p /var/run/cyurs_mu.pid -D &>  
/tmp/cyrus-mupdate-log-file-helgrind


Valgrind did find some "Possible data race during read" and "This  
conflicts with a previous write"

I have attached the full log as well




Of course, it is useful to have debugging symbols.  Compiling with -O3
migh make things faster, while compiling with -O0 will make it run
considerably slower. If ld (=ld.bfd) is very, very recent (e.g. 2.31),
then link with  -Wl,-z,noseparate-code otherwise valgrind cannot read
the debug info (https://bugs.kde.org/show_bug.cgi?id=395682).

Then it shall be possible to see in one or the other log files what is
relevant for causing SIGSERV, in particular the first place where bad
operation happens, which might be