Re: [exim] stuck exim processes

2022-02-17 Thread Evgeniy Berdnikov via Exim-users
On Thu, Feb 17, 2022 at 07:36:38PM -0800, Michael Tratz via Exim-users wrote:
> > On Feb 16, 2022, at 4:17 PM, Jeremy Harris via Exim-users wrote:
> > 
> > You don't even get a single line from truss as it attaches?
> > I wonder if the process is spinning in userland?
> > Does "top" or similar show it?
> 
> That stuck process is just sitting there and not doing anything. It still 
> shows in top, but it’s just idle.

 A process can't be "just idle": it must have a state and data structures,
 the most interesting of which is the stack. The process state may be
 displayed with "ps wchan":

 ps -p <pid> -o pid,wchan,cmd

 The stack may be printed with a debugger:

 gdb -p <pid> -f /path/to/exim
 (gdb) bt full

 But the output from gdb won't be useful unless gdb has access to the
 local symbol tables, as Jeremy pointed out:

> > If it is, I guess the next step would be to crash it with
> > a signal, having set up for coredumps (NB, exim is a setuid binary
> > in most installations.  Security considerations apply).
> > Of course, it was likely compiled with full optimisation
> > which will hinder us.  Having a "-O0 -ggdb" build would
> > help.  I don't know what FreeBSD does about debuginfo;
> > is that likely to be a separate install item, to get
> > symbols for the binary?

 The puzzling thing is that truss "shows nothing" on attach.
 It may be an indication of some kernel bug, which may also affect the
 behaviour of gdb. I suspect we are observing an abnormal process state
 similar to "zombie": say, process deletion has not completed but the
 userspace data structures have already been deleted.
-- 
 Eugene Berdnikov

-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


Re: [exim] stuck exim processes

2022-02-17 Thread Michael Tratz via Exim-users


> On Feb 16, 2022, at 4:17 PM, Jeremy Harris via Exim-users wrote:
> 
> (taking the 2xx variant)
> 
>> I tried truss to trace the system calls of both processes. 77631 is not 
>> printing anything.
> 
> You don't even get a single line from truss as it attaches?
> I wonder if the process is spinning in userland?
> Does "top" or similar show it?

That stuck process is just sitting there and not doing anything. It still shows 
in top, but it’s just idle.
I will see if I can find another one and let truss run overnight; maybe it will 
print something after waiting for a long time. I may not have been patient 
enough. :-)

> 
> If it is, I guess the next step would be to crash it with
> a signal, having set up for coredumps (NB, exim is a setuid binary
> in most installations.  Security considerations apply).
> Of course, it was likely compiled with full optimisation
> which will hinder us.  Having a "-O0 -ggdb" build would
> help.  I don't know what FreeBSD does about debuginfo;
> is that likely to be a separate install item, to get
> symbols for the binary?

I haven’t had time to recompile the port with the debug symbols. I will do so 
once I can. Right now I’m pretty swamped with other things on my to-do list.

> 
> 
> 
>> This happens for messages which get a 4xx or 5xx error.
> 
> For all 4xx/5xx ?  Or "when it goes wrong, for those, that's
> how it goes" ?

Again, not for all. It's just that when it goes wrong, exim follows that same
pattern. I did notice it happens consistently with certain remote servers. For
example, smtp.secureserver.net (GoDaddy) always seems to be a good culprit; I
see a lot of issues with them. But this wasn’t the case with 4.94.2. Another
one was cloudmail102.zonecybersite.com. There are more, but I have just started
checking whether adding certain servers to hosts_avoid_tls gives me fewer of
those stuck processes.

Thanks,

Michael



Re: [exim] Truncated warning messages (again)

2022-02-17 Thread Christian Balzer via Exim-users
On Thu, 17 Feb 2022 11:25:01 + Jeremy Harris via Exim-users wrote:

> On 17/02/2022 05:04, Christian Balzer via Exim-users wrote:
> > Maybe phrasing here, but clearly the previous behavior of displaying the
> > full response of the remote SMTP server is more "beautiful" than the
> > truncated to the point of unreadable one with current Exim versions?  
> 
> Oh, you are comparing to a previous Exim version.  I suggest you
> log a bug, giving details of the versions, and any difference
> in the configs used.  It would also help to know what the state
> of the retry DB is for the problem address prior to the delivery
> attempt that results in a problem bounce message.
> 

Well, after re-jiggering my ancient bugzilla account I found it was
already reported in all its glory, 2 years ago.
You weren't kidding when you said a fix would not be quick, but this is
still a major regression in my book...

https://bugs.exim.org/show_bug.cgi?id=2535


Chibi
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications



Re: [exim] bypassing the bogofilter check

2022-02-17 Thread Sławomir Dworaczek via Exim-users

Hello
bogofilter is an anti-spam filter. The point is that e-mails from a certain
address, e.g. f...@bar.fo, that are tagged as spam are to be delivered to the
user and not to spam; the rest of the e-mails are to be filtered.

regards
Slawek




Re: [exim] Stuck processes trying to deliver message

2022-02-17 Thread Patrik Peng via Exim-users

On 14.02.22 17:50, Patrik Peng via Exim-users wrote:
> On 14.02.22 16:27, Jeremy Harris via Exim-users wrote:
>> Next time you get one, please run it with "-d+all" rather
>> than "-v".
>
> Will do.


---8<---
15:56:41 87345   SMTP<< 250 2.0.0 from MTA(smtp:[172.22.160.16]:10025): 250 2.0.0 Ok: queued as A483A3FB87
15:56:41 87345 S:journalling some.u...@somehost.de
15:56:41 87345 ok=1 send_quit=1 send_rset=0 continue_more=0 yield=0 first_address is NULL
15:56:41 87345 62.156.249.27 in hosts_nopass_tls? no (option unset)
15:56:41 87345 transport_check_waiting entered
15:56:41 87345   sequence=1 local_max=500 global_max=-1
15:56:41 87345   locking /var/spool/exim/db/wait-remote_smtp_dane.lockfile
15:56:41 87345   locked /var/spool/exim/db/wait-remote_smtp_dane.lockfile
15:56:41 87345   EXIM_DBOPEN: file  dir  flags=O_RDWR
15:56:41 87345   returned from EXIM_DBOPEN: 0x801a08820
15:56:41 87345   opened hints database /var/spool/exim/db/wait-remote_smtp_dane: flags=O_RDWR
15:56:41 87345   dbfn_read: key=mx2.somehost.de
15:56:41 87345   EXIM_DBCLOSE(0x801a08820)
15:56:41 87345   closed hints database and lockfile
15:56:41 87345  no messages waiting for mx2.somehost.de
15:56:41 87345 transport_check_waiting: FALSE
15:56:41 87345   SMTP+> QUIT
15:56:41 87345 cmd buf flush 6 bytes (more expected)
15:56:41 87345 tls_write(0x801a2e420, 6, more)
15:56:41 87345 tls_write((nil), 0)
15:56:41 87345 SSL_write(0x8013f4000, 0x8013640b8, 6)
15:56:41 87345 outbytes=6 error=0
15:56:41 87345   SMTP(TLS shutdown)>>
15:56:41 87345 SSL3 alert write:warning:close notify
15:56:41 87345   SMTP(shutdown)>>
15:56:41 87345 Calling SSL_read(0x8013f4000, 0x801a2d420, 4096)
15:56:41 87345 read response data: size=15
15:56:41 87345   SMTP<< 221 2.0.0 Bye
15:56:41 87345 Calling SSL_read(0x8013f4000, 0x801a2d420, 4096)
15:56:42 87345 tls_close(): shutting down TLS (with response-wait)
15:56:42 87345 tls_write((nil), 0)
15:57:35 87344 polling subprocess pipes
15:58:35 87344 polling subprocess pipes
15:59:35 87344 polling subprocess pipes
16:00:35 87344 polling subprocess pipes
...
---8<---

Still no observed process crashes.
This output looks pretty similar to the one posted in the other thread 
and thus it will likely be the same issue.
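
For anyone comparing such debug logs across hosts, here is a small sketch (not
part of Exim) that reports the last line each PID logged, which is a quick way
to see where a delivery process went quiet. It assumes only the
"HH:MM:SS PID message" line shape seen in the excerpt above; the function name
is made up for illustration.

```python
import re

# Assumed line shape, taken from the debug excerpt: "HH:MM:SS PID message".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+(\d+)\s+(.*)$")


def last_line_per_pid(lines):
    """Return {pid: (timestamp, message)} for the final line each PID logged."""
    last = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip wrapped continuation lines, "..." and blank lines
        ts, pid, msg = m.groups()
        last[pid] = (ts, msg.strip())
    return last


sample = """\
15:56:41 87345   SMTP<< 221 2.0.0 Bye
15:56:41 87345 Calling SSL_read(0x8013f4000, 0x801a2d420, 4096)
15:56:42 87345 tls_close(): shutting down TLS (with response-wait)
15:56:42 87345 tls_write((nil), 0)
15:57:35 87344 polling subprocess pipes
15:58:35 87344 polling subprocess pipes
""".splitlines()

for pid, (ts, msg) in last_line_per_pid(sample).items():
    print(pid, ts, msg)
```

On the excerpt, the child 87345 logs nothing after the tls_write() that follows
tls_close(), while the parent 87344 keeps polling its subprocess pipes.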


Regards
Patrik





Re: [exim] exim maildirsize quota calculation in the face of symlinks

2022-02-17 Thread Cyborg via Exim-users

On 16.02.22 at 14:46, Maarten van Baarsel via Exim-users wrote:
> I'd like to say thanks for the replies, and ask for guidance how to
> put this on the feature-addition-list so that it won't be forgotten, I
> did find the problem Cyborg was alluding to in a post from a while ago :)
>
> I had a quick look at the code but did not see a fast path to a fix.
>
> Maarten.





Just an idea:

Calc() ...
   array = new array();
   Loop:
       file = openfile( ... );
       if file.inode.linkcounter == 1 || array.get( file.inode.id ) == NULL {
           array.put( file.inode.id );
           count file.size
       } // skip if it's a known hardlink


(note: the linkcounter check is actually obsolete, because it would not 
matter. It's just for illustration. )
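
The same idea as a runnable sketch in Python, purely for illustration. This is
not how Exim computes maildirsize; the function name and the choice to count
only regular files (symlinks are ignored) are assumptions of the sketch.

```python
import os
import stat


def maildir_usage(root):
    """Sum the sizes of regular files under root, counting each inode once.

    Hard links to the same file share (st_dev, st_ino), so a second name
    for an already-seen inode is skipped. Returns (total_bytes, file_count).
    """
    seen = set()
    total = 0
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except FileNotFoundError:
                continue  # file was delivered/removed while we were walking
            if not stat.S_ISREG(st.st_mode):
                continue  # assumption: symlinks and special files are not counted
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # known hardlink, already counted
            seen.add(key)
            total += st.st_size
            count += 1
    return total, count
```

Calling maildir_usage() on a maildir path returns the byte total and the number
of distinct files. As in the note above, no link-count check is needed: the
set lookup alone decides whether an inode has been counted.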


IMHO, it's dovecot that's causing this by adding hardlinks to files in 
the first place. It's not Exim's fault, even if it could avoid this 
"miscount" easily.


Best regards,
Marius




Re: [exim] Truncated warning messages (again)

2022-02-17 Thread Jeremy Harris via Exim-users

On 17/02/2022 05:04, Christian Balzer via Exim-users wrote:

Maybe phrasing here, but clearly the previous behavior of displaying the
full response of the remote SMTP server is more "beautiful" than the
truncated to the point of unreadable one with current Exim versions?


Oh, you are comparing to a previous Exim version.  I suggest you
log a bug, giving details of the versions, and any difference
in the configs used.  It would also help to know what the state
of the retry DB is for the problem address prior to the delivery
attempt that results in a problem bounce message.
--
Cheers,
  Jeremy



Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Kai Bojens via Exim-users

On 17.02.22 at 10:25, Christian Balzer via Exim-users wrote:

> Can you also confirm that small mails, sessions shorter than 2 seconds
> have no problem?


Yes and no – it could also be that some of Google's servers are the problem:

H=gmail-smtp-in.l.google.com [2a00:1450:400c:c0a::1b]: SMTP timeout 
after sending data block (266346 bytes written): Connection timed out


H=gmail-smtp-in.l.google.com [64.233.167.27]: SMTP timeout after sending 
data block (266346 bytes written): Connection timed out


H=alt1.gmail-smtp-in.l.google.com [2404:6800:4003:c00::1a] TFO 
X=TLS1.3:ECDHE_X25519__ECDSA_SECP256R1_SHA256__AES_256_GCM:256 CV=yes K 
C="250 2.0.0 OK XXX.YYY - gsmtp


This is the same mail with different delivery attempts, S=1119813.

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Jeremy Harris via Exim-users

On 17/02/2022 05:01, Christian Balzer via Exim-users wrote:

> I found it excruciatingly hard to correlate tcpdump and nf_conntrack
> flows, but those ICMP6 destination unreachable packets are the result of
> the local iptables rejecting a connection to port 43922 (the originating
> outbound SMTP session from here), something it allowed for the first 2
> seconds just fine.


I agree; I was just interested in the content of the ICMP
"destination unreachable" packets - they usually carry the
start of the cause packet as data.  You might prefer to use
wireshark to poke around in the capture sample you showed.



> this is likely to result in nothing at all.


Unfortunately true.
--
Cheers,
  Jeremy



Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Christian Balzer via Exim-users
On Thu, 17 Feb 2022 11:25:15 +0300 Evgeniy Berdnikov via Exim-users wrote:

> On Thu, Feb 17, 2022 at 02:01:49PM +0900, Christian Balzer via Exim-users wrote
> > I found it excruciatingly hard to correlate tcpdump and nf_conntrack
> > flows,  
> 
>  These data can be related via timestamps, they may be enabled for
>  conntrack output:
> 
>    conntrack -o timestamp,ktimestamp -E ...
> 
>  Note that timestamping for kernel module should be enabled via option
>  net.netfilter.nf_conntrack_timestamp (read man conntrack for details).
> 
Thanks for that info!

> > but those ICMP6 destination unreachable packets are the result of
> > the local iptables rejecting a connection to port 43922 (the originating
> > outbound SMTP session from here), something it allowed for the first 2
> > seconds just fine.
> > 
> > The:
> > ---
> > -A INPUT -p icmpv6 -j ACCEPT
> > -A INPUT -i bond+ -m state --state ESTABLISHED,RELATED -j ACCEPT
> > ---  
> 
>  No rejection rules here. Look for your iptables rules to find sources
>  of rejection, then insert logging rules to debug.
>
I was only quoting the relevant rules, as in "with that it should
work" (and it does for everybody else).

Of course there was/is reject at the end:
---
-A INPUT -i bond+ -p tcp -m tcp --dport 465 -j ACCEPT
-A INPUT -i bond+ -p tcp -m tcp --dport 587 -j ACCEPT
-A INPUT -i bond+ -p tcp -m tcp --dport 80 -j ACCEPT
-A INPUT -i bond+ -p tcp -m tcp --dport 443 -j ACCEPT
-A INPUT -p icmpv6 -j ACCEPT
-A INPUT -i bond+ -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -i bond+ -j LOG
-A INPUT -i bond+ -j REJECT
---

And plenty of rejects in the kernel log, which is how I found out
about this in the first place.
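
To make the rule order above concrete, here is a toy model of first-match
evaluation in Python. It is only a sketch: the packet fields ("iface",
"proto", "dport", "ctstate") and the default chain policy are assumptions made
up for the illustration, and it says nothing about why conntrack would stop
treating the flow as established.

```python
# Toy model of first-match evaluation for the INPUT rules quoted above.
# Packet field names and the default policy are assumptions of this sketch.

TERMINATING = {"ACCEPT", "REJECT"}


def on_bond(pkt):
    # "-i bond+" matches any interface whose name starts with "bond".
    return pkt["iface"].startswith("bond")


def tcp_dport(port):
    return lambda pkt: on_bond(pkt) and pkt["proto"] == "tcp" and pkt["dport"] == port


RULES = [
    (tcp_dport(465), "ACCEPT"),
    (tcp_dport(587), "ACCEPT"),
    (tcp_dport(80), "ACCEPT"),
    (tcp_dport(443), "ACCEPT"),
    (lambda p: p["proto"] == "icmpv6", "ACCEPT"),
    (lambda p: on_bond(p) and p["ctstate"] in ("ESTABLISHED", "RELATED"), "ACCEPT"),
    (on_bond, "LOG"),
    (on_bond, "REJECT"),
]


def evaluate(pkt, policy="ACCEPT"):
    """Return (verdict, logged) for a packet walked through RULES in order."""
    logged = False
    for match, target in RULES:
        if not match(pkt):
            continue
        if target in TERMINATING:
            return target, logged
        logged = True  # LOG does not end traversal; continue with the next rule
    return policy, logged


# A reply to the outbound SMTP session, arriving on local port 43922.
reply = {"iface": "bond0", "proto": "tcp", "dport": 43922, "ctstate": "ESTABLISHED"}
print(evaluate(reply))
print(evaluate(dict(reply, ctstate="INVALID")))
```

In this model the reply is accepted only while its state is ESTABLISHED or
RELATED; with any other state it falls through to the LOG and REJECT rules,
which matches the rejects showing up in the kernel log.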

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications



Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Christian Balzer via Exim-users
Hello,


On Thu, 17 Feb 2022 10:00:12 +0100 Kai Bojens via Exim-users wrote:

> On 16.02.22 at 08:17, Christian Balzer via Exim-users wrote:
> 
> > On the MSAs we added IPv6 and noticed this oddity when delivering large
> > mails (over 128KB it seems) to google/gmail via v6:  
> 
> (…)
> 
> I can confirm this and have also seen this for IPv4 this morning:
> 
Thanks for that, my sanity has been restored.

For the record, I have/had very different FW rules for v4, which explains
why this did not show on there as well.

> H=gmail-smtp-in.l.google.com [2a00:1450:400c:c0b::1a]: SMTP timeout 
> after sending data block (266346 bytes written): Connection timed out
> 
> H=gmail-smtp-in.l.google.com [142.250.147.26]: SMTP timeout after 
> sending data block (266133 bytes written): Connection timed out
> 
> This happens on a Debian Bullseye with Exim 4.94.2, nftables and 
> IPv4/IPv6. Other systems with Ubuntu 18.04, Exim 4.90_1, IPv4 only 
> and iptables are not affected. As of now I have no explanation on our 
> side for this behavior.
> 
Can you also confirm that small mails, sessions shorter than 2 seconds
have no problem?

Christian



-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications



Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Evgeniy Berdnikov via Exim-users
On Thu, Feb 17, 2022 at 11:25:15AM +0300, Evgeniy Berdnikov via Exim-users wrote:
> > The:
> > ---
> > -A INPUT -p icmpv6 -j ACCEPT
> > -A INPUT -i bond+ -m state --state ESTABLISHED,RELATED -j ACCEPT
> > ---

 BTW, the "state" match module is deprecated in favor of the "conntrack"
 match with its "--ctstate" option.

>  No rejection rules here. Look for your iptables rules to find sources
>  of rejection, then insert logging rules to debug.

-- 
 Eugene Berdnikov



Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Kai Bojens via Exim-users

On 16.02.22 at 08:17, Christian Balzer via Exim-users wrote:

> On the MSAs we added IPv6 and noticed this oddity when delivering large
> mails (over 128KB it seems) to google/gmail via v6:

(…)

I can confirm this and have also seen this for IPv4 this morning:

H=gmail-smtp-in.l.google.com [2a00:1450:400c:c0b::1a]: SMTP timeout 
after sending data block (266346 bytes written): Connection timed out


H=gmail-smtp-in.l.google.com [142.250.147.26]: SMTP timeout after 
sending data block (266133 bytes written): Connection timed out


This happens on a Debian Bullseye with Exim 4.94.2, nftables and 
IPv4/IPv6. Other systems with Ubuntu 18.04, Exim 4.90_1, IPv4 only 
and iptables are not affected. As of now I have no explanation on our 
side for this behavior.




Re: [exim] Google/gmail timeouts, IPv6 conntrack issue?

2022-02-17 Thread Evgeniy Berdnikov via Exim-users
On Thu, Feb 17, 2022 at 02:01:49PM +0900, Christian Balzer via Exim-users wrote
> I found it excruciatingly hard to correlate tcpdump and nf_conntrack
> flows,

 These data can be related via timestamps, they may be enabled for
 conntrack output:

   conntrack -o timestamp,ktimestamp -E ...

 Note that timestamping for kernel module should be enabled via option
 net.netfilter.nf_conntrack_timestamp (read man conntrack for details).
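
 A quick way to check that option before starting a capture, sketched in
 Python. The sysctl name maps to the procfs path used below; the function
 name and the None-for-missing convention are just choices of this sketch.

```python
from pathlib import Path

# sysctl net.netfilter.nf_conntrack_timestamp corresponds to this procfs file.
SYSCTL_PATH = Path("/proc/sys/net/netfilter/nf_conntrack_timestamp")


def timestamping_enabled(path=SYSCTL_PATH):
    """Return True or False for the sysctl value, or None if the file is
    absent (for example when the nf_conntrack module is not loaded)."""
    try:
        return Path(path).read_text().strip() == "1"
    except FileNotFoundError:
        return None
```

 If timestamping_enabled() returns False, the kernel timestamps requested
 with "-o ktimestamp" are not being recorded for flows yet.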

> but those ICMP6 destination unreachable packets are the result of
> the local iptables rejecting a connection to port 43922 (the originating
> outbound SMTP session from here), something it allowed for the first 2
> seconds just fine.
> 
> The:
> ---
> -A INPUT -p icmpv6 -j ACCEPT
> -A INPUT -i bond+ -m state --state ESTABLISHED,RELATED -j ACCEPT
> ---

 No rejection rules here. Look for your iptables rules to find sources
 of rejection, then insert logging rules to debug.
-- 
 Eugene Berdnikov
