Re: [exim] data timeout on connection

2019-10-28 Thread Niels Dettenbach via Exim-users
On Tuesday, 22 October 2019, 13:19:28 CET, Hardy via Exim-users wrote:
> I didn't change effectively anything, neither to cause nor to resolve
> the problems, and the sender sides were too many different ones as I
> would think it plausible they had a problem.
> 
> Some of you in this list suggested mis-aligned network. I suspect this
> happened on my hoster's part. They did not communicate any problem,
> though. I suspect they misconfigured and corrected silently, whatever it
> was. According to my logs this situation lasted for about 12+ hours.
I would add a +1 here, because I have not seen any further problems for 
weeks now. We host everything ourselves, except the BGP gateways, with 
"plain internet access". I contacted our NOC / upstream partner about 
this, but he had no clue at all about this effect, so I put it aside...

Possibly some (proprietary) routing / network firmware on a (Tier 1?) IP 
"network device" got updated recently?

bit crazy...

thanks to you guys for sharing your details and the logging hints.


niels

-- 
 ---
 Niels Dettenbach
 Syndicat IT & Internet
 http://www.syndicat.com
 PGP: https://syndicat.com/pub_key.asc
 ---
 





-- 
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


Re: [exim] data timeout on connection

2019-10-22 Thread Hardy via Exim-users

Hi all,

I just want to let you know that the situation normalized "all by itself", 
and as far as I can judge no message was lost, as the sending side 
evidently considered the problem a temporary one and we were still within 
the retry periods.


I effectively didn't change anything, neither to cause nor to resolve 
the problems, and there were too many different senders for me to think 
it plausible that they all had a problem.


Some of you on this list suggested a misconfigured network. I suspect this 
happened on my hoster's side. They did not communicate any problem, 
though. I suspect they misconfigured something and silently corrected it, 
whatever it was. According to my logs this situation lasted for 12+ hours.


Thanks for all your suggestions.

Hardy


> all of a sudden (after a reboot of the machine, but I cannot see a
> connection to that) exim produces a lot of
>
> SMTP data timeout (message abandoned) on connection from mx.example.com
> [IP] F=
>
> in my logs. These are always the same systems, that retry and fail
> again. Other systems don't show probs. By the looks this happens in the
> rcpt or data ACL, as the F= is available in the log.
>
> I reinstalled last week's exim.conf to cancel recent changes, but this
> did not help.



Re: [exim] data timeout on connection

2019-10-19 Thread Hartmut Steffin via Exim-users

Slowly, the mails from this afternoon are rolling in...
I hope my excitement is not premature, but whenever I learn what 
was haunting things there, I will let you know.


Thanks for your heads-up. I would still like to know what the hummus 
server's log had to say about the timeout.


On 18.10.19 16:15, Cyborg wrote:

> On 18.10.19 at 15:32, Hardy via Exim-users wrote:
>
>> Cyborg,
>>
>> you mean it really may happen that "all of a sudden" my kernel's IP
>> stack is not compatible with half of the rest of the world?
>>
>> Granted, it is quite an old one, as I do not update production systems
>> often; I prefer to build a new system and migrate, though not that
>> often either.
>>
>> But again, all of a sudden incompatible with 50% of the world out there?
>
> It does not happen with all TCP connections. It depends on the size,
> the travel path of the packets, fragmentation, etc.
>
> So how old is your kernel?
>
> Best regards,
> Marius







Re: [exim] data timeout on connection

2019-10-19 Thread Jeremy Harris via Exim-users
On 18/10/2019 23:29, Niels Dettenbach (Syndicat.com) via Exim-users wrote:
> - removed some DNSBL requests (shorten timing / eliminate some DNS reqs)

If there's a possibility of the receiving exim taking long enough to
cause the sender to give up (yet, in the OP's case, stop sending
data but not close the TCP connection) :-

try adding "+smtp_connection +smtp_incomplete_transaction +millisec"
to log_selector. This should give a little more info.

-- 
Cheers,
  Jeremy



Re: [exim] data timeout on connection

2019-10-18 Thread Niels Dettenbach (Syndicat.com) via Exim-users

> On 18.10.2019 at 09:32, Hardy via Exim-users wrote:
> 
> Cyborg,
> 
> you mean it really may happen that "all of a sudden" my kernel's IP 
> stack is not compatible with half of the rest of the world?

I have seen a similar "effect" since 4.92* with only a few out of hundreds of 
MTAs connecting to Exim on our side in a large NOC (no firewall or similar in 
between). Sometimes the senders even get the mail through after several retries.

The machine is NetBSD-based and has worked flawlessly for "years" with Exim.

My suspicion was some "security" or "firewalling" stuff on the sender's side, 
but even some larger services (with a high volume of daily email) were affected.

I unconfigured / commented out some stuff (a bit of a shot in the dark...) like:

- acl_check_mail
- acl_check_notsmtp
- acl_check_notsmtp
- removed some DNSBL requests (shorten timing / eliminate some DNS reqs)


which seemed to "help" (as far as I can see for now). As the effect is very 
hard to reproduce, it is not easy (for me) to troubleshoot / dig into. To me it 
looked like a runtime bug or similar in Exim (probably depending on the 
"environment" in some way).

just to add here…


many thanks,


niels.

—

Niels Dettenbach
n...@syndicat.com
https://www.syndicat.com




Re: [exim] data timeout on connection

2019-10-18 Thread Evgeniy Berdnikov via Exim-users
On Fri, Oct 18, 2019 at 12:55:17PM +0200, Hardy via Exim-users wrote:
> Hi all,
> 
> all of a sudden (after a reboot of the machine, but I cannot see a
> connection to that) exim produces a lot of
> 
> SMTP data timeout (message abandoned) on connection from mx.example.com [IP]
> F=
> 
> in my logs. These are always the same systems, that retry and fail again.
> Other systems don't show probs. By the looks this happens in the rcpt or
> data ACL, as the F= is available in the log.

 These are symptoms of broken Path MTU Discovery. This is usually a network
 misconfiguration on the sender's side. If so, the small packets at the
 beginning of the SMTP session pass through, but the large ones (after DATA)
 are lost, which is why the session dies with a timeout.

 Run any traffic analyzer to capture packets from/to this [IP],
 and study the dump.

> I reinstalled last week's exim.conf to cancel recent changes, but this did
> not help.

 With broken Path MTU Discovery that would not be expected to help. But if
 this is the cause, there is a workaround: you can reduce the announced MSS
 for this client's IP in order to lower the sender's maximum packet size.
 This can be done with the kernel packet filter. For example, Linux has a
 TCPMSS target for iptables, described in man iptables-extensions.
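
 As a rough illustration of that workaround, here is a minimal Python sketch
 that derives an MSS from a path MTU and assembles an iptables TCPMSS rule as
 an argument list, without running it. The helper names, the 1400-byte path
 MTU and the placeholder address 192.0.2.10 are assumptions for illustration,
 not values from this thread. The rule rewrites the MSS option in the SYN/ACK
 replies sent to that one client.

```python
import ipaddress


def mss_for_mtu(mtu):
    """Largest TCP payload for an IPv4 packet of `mtu` bytes.

    Subtracts 20 bytes of IPv4 header and 20 bytes of TCP header
    (no IP or TCP options assumed).
    """
    return mtu - 40


def tcpmss_rule(client_ip, mss):
    """Build (but do not run) an iptables command for one client.

    The rule matches outgoing TCP packets to `client_ip` that have SYN
    set and RST clear, which includes our SYN/ACK replies, and sets
    their MSS option to `mss`.
    """
    ipaddress.IPv4Address(client_ip)  # raises ValueError if malformed
    return [
        "iptables", "-t", "mangle", "-A", "OUTPUT",
        "-d", client_ip, "-p", "tcp",
        "--tcp-flags", "SYN,RST", "SYN",
        "-j", "TCPMSS", "--set-mss", str(mss),
    ]


# Assumed values: 1400-byte path MTU, documentation-range placeholder IP.
print(" ".join(tcpmss_rule("192.0.2.10", mss_for_mtu(1400))))
```

 With these assumed values the printed command ends in "--set-mss 1360".
 Applying it needs root, and a sensible MSS depends on the real path, which
 the packet dump should reveal.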
-- 
 Eugene Berdnikov



Re: [exim] data timeout on connection

2019-10-18 Thread Hardy via Exim-users

Cyborg,

you mean it really may happen that "all of a sudden" my kernel's IP 
stack is not compatible with half of the rest of the world?


Granted, it is quite an old one, as I do not update production systems 
often; I prefer to build a new system and migrate, though not that often either.


But again, all of a sudden incompatible with 50% of the world out there?






Re: [exim] data timeout on connection

2019-10-18 Thread Cyborg via Exim-users
On 18.10.19 at 14:38, Jeremy Harris via Exim-users wrote:
> On 18/10/2019 13:06, Hardy via Exim-users wrote:
>> And NOW:
>> 2019-10-18T13:56:03.718183+02:00 mailfass exim[4587]: SMTP data timeout
>> (message abandoned) on connection from hummus.csx.cam.ac.uk
>> [131.111.8.88] F=
>>
>> Perhaps someone from your side may have look ;-)
> Not our problem :)
>
>

Don't be so quick to reject it...

When I worked for another company, we stumbled over some weird problems.
Customers' mail clients reported that they had sent a mail,
but the server said the other side disconnected after xxx seconds, same
as above.

The reason why that happened was tracked down by Linux kernel devs. The
TCP/IP stacks of the server and the client had a TCP window problem.
This manifested itself as both sides waiting for the other side to send
data. One side (Windows) had sent "QUIT", but the TCP window mismatch
made the Linux TCP stack think that nothing had arrived. So both waited
until the timeout; the mail client thought "worked", and Exim said
"crap! drop it".

You are correct, it's not Exim's problem. The Linux kernel devs said: not
our fault, shit happens... but they also said that similar kernels should
not have this problem. This brings me to the conclusion that one of you
may need a kernel update.

best regards,
Marius





Re: [exim] data timeout on connection

2019-10-18 Thread Jeremy Harris via Exim-users
On 18/10/2019 13:06, Hardy via Exim-users wrote:
> And NOW:
> 2019-10-18T13:56:03.718183+02:00 mailfass exim[4587]: SMTP data timeout
> (message abandoned) on connection from hummus.csx.cam.ac.uk
> [131.111.8.88] F=
> 
> Perhaps someone from your side may have look ;-)

Not our problem :)

> Few MTAs still get their messages through. I was successful via telnet
> and did not see anything odd.

I suspect either you have an odd firewall interfering, or a very poor
physical connection.  Is there any obvious pattern, such as you only
get successful in-clear connections?
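
To look for such a pattern in the logs, a minimal Python sketch along these
lines could tally the timeouts per remote host. It assumes the log lines look
like the one quoted above; the function name and the regular expression are
illustrative and not part of Exim.

```python
import re
from collections import Counter

# Matches the timeout line quoted in this thread. The host name before
# the bracketed IP may be missing, so it is captured lazily.
TIMEOUT_RE = re.compile(
    r"SMTP data timeout \(message abandoned\) on connection from "
    r"(?P<host>.*?)\s*\[(?P<ip>[^\]]+)\]"
)


def count_data_timeouts(lines):
    """Return a Counter mapping (host, ip) to the number of data timeouts."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("host"), match.group("ip"))] += 1
    return counts
```

Feeding it the lines of the mail log (wherever syslog writes Exim's log on
your system) and calling .most_common() on the result lists the
worst-affected senders first, which can then be compared against the hosts
that still deliver successfully.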
-- 
Cheers,
  Jeremy



Re: [exim] data timeout on connection

2019-10-18 Thread Hardy via Exim-users


Update

Jeremy, I saw your post via the web. I do not even use check_data; it is 
commented out. In check_rcpt I have now added a condition-less "accept" VERY 
early to rule out effects of later rules. The problem persists.


And NOW:
2019-10-18T13:56:03.718183+02:00 mailfass exim[4587]: SMTP data timeout 
(message abandoned) on connection from hummus.csx.cam.ac.uk 
[131.111.8.88] F=


Perhaps someone from your side may have a look ;-)

A few MTAs still get their messages through. I was successful via telnet 
and did not see anything odd.



#

> Hi all,
>
> all of a sudden (after a reboot of the machine, but I cannot see a
> connection to that) exim produces a lot of
>
> SMTP data timeout (message abandoned) on connection from mx.example.com
> [IP] F=
>
> in my logs. These are always the same systems, that retry and fail
> again. Other systems don't show probs. By the looks this happens in the
> rcpt or data ACL, as the F= is available in the log.
>
> I reinstalled last week's exim.conf to cancel recent changes, but this
> did not help.
>
> Hope your answers come through ;-) Will follow on the Web archive.
>
> Yours urgently
> Hardy






[exim] data timeout on connection

2019-10-18 Thread Hardy via Exim-users

Hi all,

all of a sudden (after a reboot of the machine, but I cannot see a
connection to that) exim produces a lot of

SMTP data timeout (message abandoned) on connection from mx.example.com 
[IP] F=


in my logs. These are always the same systems, which retry and fail 
again. Other systems don't show problems. By the looks of it, this happens 
in the rcpt or data ACL, as the F= is available in the log.


I reinstalled last week's exim.conf to cancel recent changes, but this 
did not help.


Hope your answers come through ;-) Will follow on the Web archive.

Yours urgently
Hardy

