Re: AW: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-22 Thread Wagner, Patrick




We can probably get rid of the tempory debug output code, can't we?

-Patrick

On 22.02.2021 13:17, s...@acts.hu wrote:



Hello,

thank you for both of your feedbacks. As much as I dislike systemd,
it allows you to restart piler-smtp in case it crashes. But hopefully
it won't happen in the future after this fix.

Also I've merged it to the master branch, and I'll release a bugfix
version 1.3.11, and I'll make a deb package.

Janos


On 2021-02-22 12:11, claudio.schnelldasi...@manrolandgoss.com wrote:

Janos, Patrick,

thanks for this. I also run piler on Ubuntu 18.04 and ran into the 
same issue.

My 'solution' was to start piler-smtp automatically via monitoring if
it goes down :-(.

Janos, I wonder if you will provide updated .deb packages for 
dpkg-installs?


Thanks and regards,
Claudio

-Ursprüngliche Nachricht-
Von: Wagner, Patrick 
Gesendet: Montag, 22. Februar 2021 12:01
An: Piler User 
Betreff: Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV



Thanks Janos,

I'd updated and replaced the piler-smtp binary on 2021-02-16, and
while I can see that my monitoring still shows SMTP timeouts on the
piler port sometimes between 02:30 and 02:40, no further outright
crashes / SIGSEGV occur, so archiving continues without any need to
restart the process manually.


- Patrick

On 15.02.2021 21:49, s...@acts.hu wrote:



Hello Patrick,

thank you for your feedback. I'd like you to update the smtpsegv
branch, and recompile piler-smtp. I've refactored the timeout checks,
and pushed the commit to the branch on bitbucket.

Janos


On 2021-02-15 14:34, Wagner, Patrick wrote:

Hello Janos,


happened Sa->So this time.


Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) see
below, SIGSEGV follows right after free(session) done.

It is also interesting to see that the piler daemon finishes
processing a message 2 whole minutes after the smtp process has
already segfaulted and and stopped supplying any new messages.


Thanks, Patrick



Feb 14 02:29:59 hostname piler-smtp[19066]: connected from
192.168.165.154:9501 on fd=6 (active connections: 1) Feb 14 02:29:59
hostname piler-smtp[19066]: received:
Q056C70SAJJT9OFH, from=, size=4944, client=192.168.165.154, fd=6,
fsync=1831
Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from
192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)
Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:29:59 hostname piler-smtp[19066]: SSL_free() Feb 14 02:29:59
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:29:59 hostname
piler-smtp[19066]: freeing session Feb 14 02:29:59 hostname
piler-smtp[19066]: free(session) done Feb 14 02:30:00 hostname
piler[19070]: 1/Q056C70SAJJT9OFH:
400060287d2205ab9b5400229f77e67a, size=4944/1928, attachments=0,
reference=, message-id=<4DdV5z0DlSz 5vfP@mailrelay.intern>,
retention=7350, folder=0, delay=0.0425, status=stored Feb 14 02:30:01
hostname CRON[7835]: (piler) CMD (/usr/bin/find /var/www/piler/tmp
-type f -name i.\* -exec rm -f {} \;) Feb 14 02:30:01 hostname
CRON[7836]: (piler) CMD (/usr/bin/find /var/piler/error -type f|wc -l
> /var/piler/stat/error) Feb 14 02:30:01 hostname CRON[7842]: (piler)
CMD (/usr/bin/indexer --quiet note1 --rotate) Feb 14 02:30:01
hostname CRON[7837]: (piler) CMD (/usr/bin/indexer --quiet tag1
--rotate) Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed,
discarding output) Feb 14 02:30:01 hostname CRON[7832]: (CRON) info
(No MTA installed, discarding output) Feb 14 02:30:01 hostname piler:
INDEXER INFO: merging to main started Feb 14 02:30:40 hostname
piler-smtp[19066]: connected from
192.168.165.252:37934 on fd=6 (active connections: 1) Feb 14 02:30:41
hostname piler-smtp[19066]: connected from
192.168.165.252:37936 on fd=9 (active connections: 2) Feb 14 02:30:41
hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
piler-smtp[19066]: freeing session Feb 14 02:30:41 hostname
piler-smtp[19066]: free(session) done Feb 14 02:30:41 hostname
piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
piler-smtp[19066]: freeing session Feb 14 02:30:41 hostname
piler-smtp[19066]: free(session) done Feb 14 02:31

AW: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-22 Thread claudio.schnelldasilva
Janos, Patrick,

thanks for this. I also run piler on Ubuntu 18.04 and ran into the same issue.
My 'solution' was to start piler-smtp automatically via monitoring if it goes 
down :-(.

Janos, I wonder if you will provide updated .deb packages for dpkg-installs?

Thanks and regards,
Claudio

-Ursprüngliche Nachricht-
Von: Wagner, Patrick 
Gesendet: Montag, 22. Februar 2021 12:01
An: Piler User 
Betreff: Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV



Thanks Janos,

I'd updated and replaced the piler-smtp binary on 2021-02-16, and while I can 
see that my monitoring still shows SMTP timeouts on the piler port sometimes 
between 02:30 and 02:40, no further outright crashes / SIGSEGV occur, so 
archiving continues without any need to restart the process manually.


- Patrick

On 15.02.2021 21:49, s...@acts.hu wrote:
>
>
> Hello Patrick,
>
> thank you for your feedback. I'd like you to update the smtpsegv
> branch, and recompile piler-smtp. I've refactored the timeout checks,
> and pushed the commit to the branch on bitbucket.
>
> Janos
>
>
> On 2021-02-15 14:34, Wagner, Patrick wrote:
>> Hello Janos,
>>
>>
>> happened Sa->So this time.
>>
>>
>> Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) see
>> below, SIGSEGV follows right after free(session) done.
>>
>> It is also interesting to see that the piler daemon finishes
>> processing a message 2 whole minutes after the smtp process has
>> already segfaulted and and stopped supplying any new messages.
>>
>>
>> Thanks, Patrick
>>
>>
>>
>> Feb 14 02:29:59 hostname piler-smtp[19066]: connected from
>> 192.168.165.154:9501 on fd=6 (active connections: 1) Feb 14 02:29:59
>> hostname piler-smtp[19066]: received:
>> Q056C70SAJJT9OFH, from=, size=4944, client=192.168.165.154, fd=6,
>> fsync=1831
>> Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from
>> 192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)
>> Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session() Feb
>> 14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
>> 02:29:59 hostname piler-smtp[19066]: SSL_free() Feb 14 02:29:59
>> hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:29:59 hostname
>> piler-smtp[19066]: freeing session Feb 14 02:29:59 hostname
>> piler-smtp[19066]: free(session) done Feb 14 02:30:00 hostname
>> piler[19070]: 1/Q056C70SAJJT9OFH:
>> 400060287d2205ab9b5400229f77e67a, size=4944/1928, attachments=0,
>> reference=, message-id=<4DdV5z0DlSz 5vfP@mailrelay.intern>,
>> retention=7350, folder=0, delay=0.0425, status=stored Feb 14 02:30:01
>> hostname CRON[7835]: (piler) CMD (/usr/bin/find /var/www/piler/tmp
>> -type f -name i.\* -exec rm -f {} \;) Feb 14 02:30:01 hostname
>> CRON[7836]: (piler) CMD (/usr/bin/find /var/piler/error -type f|wc -l
>> > /var/piler/stat/error) Feb 14 02:30:01 hostname CRON[7842]: (piler)
>> CMD (/usr/bin/indexer --quiet note1 --rotate) Feb 14 02:30:01
>> hostname CRON[7837]: (piler) CMD (/usr/bin/indexer --quiet tag1
>> --rotate) Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD
>> (/usr/local/libexec/piler/indexer.main.sh)
>> Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed,
>> discarding output) Feb 14 02:30:01 hostname CRON[7832]: (CRON) info
>> (No MTA installed, discarding output) Feb 14 02:30:01 hostname piler:
>> INDEXER INFO: merging to main started Feb 14 02:30:40 hostname
>> piler-smtp[19066]: connected from
>> 192.168.165.252:37934 on fd=6 (active connections: 1) Feb 14 02:30:41
>> hostname piler-smtp[19066]: connected from
>> 192.168.165.252:37936 on fd=9 (active connections: 2) Feb 14 02:30:41
>> hostname piler-smtp[19066]: disconnected from
>> 192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
>> Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
>> 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
>> 02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
>> hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
>> piler-smtp[19066]: freeing session Feb 14 02:30:41 hostname
>> piler-smtp[19066]: free(session) done Feb 14 02:30:41 hostname
>> piler-smtp[19066]: disconnected from
>> 192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
>> Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
>> 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
>> 02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
>> hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
>> piler-smtp[1

Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-22 Thread Wagner, Patrick




Thanks Janos,

I'd updated and replaced the piler-smtp binary on 2021-02-16, and while 
I can see that my monitoring still shows SMTP timeouts on the piler port 
sometimes between 02:30 and 02:40, no further outright crashes / SIGSEGV 
occur, so archiving continues without any need to restart the process 
manually.



- Patrick

On 15.02.2021 21:49, s...@acts.hu wrote:



Hello Patrick,

thank you for your feedback. I'd like you to update the smtpsegv branch,
and recompile piler-smtp. I've refactored the timeout checks, and pushed
the commit to the branch on bitbucket.

Janos


On 2021-02-15 14:34, Wagner, Patrick wrote:

Hello Janos,


happened Sa->So this time.


Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) see
below, SIGSEGV follows right after free(session) done.

It is also interesting to see that the piler daemon finishes
processing a message 2 whole minutes after the smtp process has
already segfaulted and and stopped supplying any new messages.


Thanks, Patrick



Feb 14 02:29:59 hostname piler-smtp[19066]: connected from
192.168.165.154:9501 on fd=6 (active connections: 1)
Feb 14 02:29:59 hostname piler-smtp[19066]: received:
Q056C70SAJJT9OFH, from=, size=4944, client=192.168.165.154, fd=6,
fsync=1831
Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from
192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)
Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:29:59 hostname piler-smtp[19066]: freeing session
Feb 14 02:29:59 hostname piler-smtp[19066]: free(session) done
Feb 14 02:30:00 hostname piler[19070]: 1/Q056C70SAJJT9OFH:
400060287d2205ab9b5400229f77e67a, size=4944/1928, attachments=0,
reference=, message-id=<4DdV5z0DlSz
5vfP@mailrelay.intern>, retention=7350, folder=0, delay=0.0425, 
status=stored

Feb 14 02:30:01 hostname CRON[7835]: (piler) CMD (/usr/bin/find
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb 14 02:30:01 hostname CRON[7836]: (piler) CMD (/usr/bin/find
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb 14 02:30:01 hostname CRON[7842]: (piler) CMD (/usr/bin/indexer
--quiet note1 --rotate)
Feb 14 02:30:01 hostname CRON[7837]: (piler) CMD (/usr/bin/indexer
--quiet tag1 --rotate)
Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed,
discarding output)
Feb 14 02:30:01 hostname CRON[7832]: (CRON) info (No MTA installed,
discarding output)
Feb 14 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb 14 02:30:40 hostname piler-smtp[19066]: connected from
192.168.165.252:37934 on fd=6 (active connections: 1)
Feb 14 02:30:41 hostname piler-smtp[19066]: connected from
192.168.165.252:37936 on fd=9 (active connections: 2)
Feb 14 02:30:41 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:30:41 hostname piler-smtp[19066]: freeing session
Feb 14 02:30:41 hostname piler-smtp[19066]: free(session) done
Feb 14 02:30:41 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:30:41 hostname piler-smtp[19066]: freeing session
Feb 14 02:30:41 hostname piler-smtp[19066]: free(session) done
Feb 14 02:31:44 hostname piler-smtp[19066]: connected from
192.168.165.252:40584 on fd=6 (active connections: 1)
Feb 14 02:31:53 hostname piler-smtp[19066]: connected from
192.168.165.252:40586 on fd=9 (active connections: 2)
Feb 14 02:31:53 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:31:53 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:31:53 hostname piler-smtp[19066]: freeing session
Feb 14 02:31:53 hostname piler-smtp[19066]: free(session) done
Feb 14 02:31:53 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:31:53 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:31:53 hostname piler

Re: AW: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-22 Thread sj




On 2021-02-22 14:19, Wagner, Patrick wrote:

We can probably get rid of the tempory debug output code, can't we?


Yes. Either remove them for yourself or get the master branch,
and recompile, then overwrite the piler-smtp binary.

Janos


On 22.02.2021 13:17, s...@acts.hu wrote:



Hello,

thank you for both of your feedbacks. As much as I dislike systemd,
it allows you to restart piler-smtp in case it crashes. But hopefully
it won't happen in the future after this fix.

Also I've merged it to the master branch, and I'll release a bugfix
version 1.3.11, and I'll make a deb package.

Janos


On 2021-02-22 12:11, claudio.schnelldasi...@manrolandgoss.com wrote:

Janos, Patrick,

thanks for this. I also run piler on Ubuntu 18.04 and ran into the 
same issue.

My 'solution' was to start piler-smtp automatically via monitoring if
it goes down :-(.

Janos, I wonder if you will provide updated .deb packages for 
dpkg-installs?


Thanks and regards,
Claudio

-Ursprüngliche Nachricht-
Von: Wagner, Patrick 
Gesendet: Montag, 22. Februar 2021 12:01
An: Piler User 
Betreff: Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV



Thanks Janos,

I'd updated and replaced the piler-smtp binary on 2021-02-16, and
while I can see that my monitoring still shows SMTP timeouts on the
piler port sometimes between 02:30 and 02:40, no further outright
crashes / SIGSEGV occur, so archiving continues without any need to
restart the process manually.


- Patrick

On 15.02.2021 21:49, s...@acts.hu wrote:



Hello Patrick,

thank you for your feedback. I'd like you to update the smtpsegv
branch, and recompile piler-smtp. I've refactored the timeout 
checks,

and pushed the commit to the branch on bitbucket.

Janos


On 2021-02-15 14:34, Wagner, Patrick wrote:

Hello Janos,


happened Sa->So this time.


Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) 
see

below, SIGSEGV follows right after free(session) done.

It is also interesting to see that the piler daemon finishes
processing a message 2 whole minutes after the smtp process has
already segfaulted and and stopped supplying any new messages.


Thanks, Patrick



Feb 14 02:29:59 hostname piler-smtp[19066]: connected from
192.168.165.154:9501 on fd=6 (active connections: 1) Feb 14 
02:29:59

hostname piler-smtp[19066]: received:
Q056C70SAJJT9OFH, from=, size=4944, client=192.168.165.154, fd=6,
fsync=1831
Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from
192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)
Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:29:59 hostname piler-smtp[19066]: SSL_free() Feb 14 02:29:59
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:29:59 hostname
piler-smtp[19066]: freeing session Feb 14 02:29:59 hostname
piler-smtp[19066]: free(session) done Feb 14 02:30:00 hostname
piler[19070]: 1/Q056C70SAJJT9OFH:
400060287d2205ab9b5400229f77e67a, size=4944/1928, 
attachments=0,

reference=, message-id=<4DdV5z0DlSz 5vfP@mailrelay.intern>,
retention=7350, folder=0, delay=0.0425, status=stored Feb 14 
02:30:01

hostname CRON[7835]: (piler) CMD (/usr/bin/find /var/www/piler/tmp
-type f -name i.\* -exec rm -f {} \;) Feb 14 02:30:01 hostname
CRON[7836]: (piler) CMD (/usr/bin/find /var/piler/error -type f|wc 
-l

> /var/piler/stat/error) Feb 14 02:30:01 hostname CRON[7842]: (piler)
CMD (/usr/bin/indexer --quiet note1 --rotate) Feb 14 02:30:01
hostname CRON[7837]: (piler) CMD (/usr/bin/indexer --quiet tag1
--rotate) Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed,
discarding output) Feb 14 02:30:01 hostname CRON[7832]: (CRON) info
(No MTA installed, discarding output) Feb 14 02:30:01 hostname 
piler:

INDEXER INFO: merging to main started Feb 14 02:30:40 hostname
piler-smtp[19066]: connected from
192.168.165.252:37934 on fd=6 (active connections: 1) Feb 14 
02:30:41

hostname piler-smtp[19066]: connected from
192.168.165.252:37936 on fd=9 (active connections: 2) Feb 14 
02:30:41

hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
piler-smtp[19066]: freeing session Feb 14 02:30:41 hostname
piler-smtp[19066]: free(session) done Feb 14 02:30:41 hostname
piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:

Re: AW: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-22 Thread sj




Hello,

thank you for both of your feedbacks. As much as I dislike systemd,
it allows you to restart piler-smtp in case it crashes. But hopefully
it won't happen in the future after this fix.

Also I've merged it to the master branch, and I'll release a bugfix
version 1.3.11, and I'll make a deb package.

Janos


On 2021-02-22 12:11, claudio.schnelldasi...@manrolandgoss.com wrote:

Janos, Patrick,

thanks for this. I also run piler on Ubuntu 18.04 and ran into the same 
issue.

My 'solution' was to start piler-smtp automatically via monitoring if
it goes down :-(.

Janos, I wonder if you will provide updated .deb packages for 
dpkg-installs?


Thanks and regards,
Claudio

-Ursprüngliche Nachricht-
Von: Wagner, Patrick 
Gesendet: Montag, 22. Februar 2021 12:01
An: Piler User 
Betreff: Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV



Thanks Janos,

I'd updated and replaced the piler-smtp binary on 2021-02-16, and
while I can see that my monitoring still shows SMTP timeouts on the
piler port sometimes between 02:30 and 02:40, no further outright
crashes / SIGSEGV occur, so archiving continues without any need to
restart the process manually.


- Patrick

On 15.02.2021 21:49, s...@acts.hu wrote:



Hello Patrick,

thank you for your feedback. I'd like you to update the smtpsegv
branch, and recompile piler-smtp. I've refactored the timeout checks,
and pushed the commit to the branch on bitbucket.

Janos


On 2021-02-15 14:34, Wagner, Patrick wrote:

Hello Janos,


happened Sa->So this time.


Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) see
below, SIGSEGV follows right after free(session) done.

It is also interesting to see that the piler daemon finishes
processing a message 2 whole minutes after the smtp process has
already segfaulted and and stopped supplying any new messages.


Thanks, Patrick



Feb 14 02:29:59 hostname piler-smtp[19066]: connected from
192.168.165.154:9501 on fd=6 (active connections: 1) Feb 14 02:29:59
hostname piler-smtp[19066]: received:
Q056C70SAJJT9OFH, from=, size=4944, client=192.168.165.154, fd=6,
fsync=1831
Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from
192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)
Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:29:59 hostname piler-smtp[19066]: SSL_free() Feb 14 02:29:59
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:29:59 hostname
piler-smtp[19066]: freeing session Feb 14 02:29:59 hostname
piler-smtp[19066]: free(session) done Feb 14 02:30:00 hostname
piler[19070]: 1/Q056C70SAJJT9OFH:
400060287d2205ab9b5400229f77e67a, size=4944/1928, attachments=0,
reference=, message-id=<4DdV5z0DlSz 5vfP@mailrelay.intern>,
retention=7350, folder=0, delay=0.0425, status=stored Feb 14 02:30:01
hostname CRON[7835]: (piler) CMD (/usr/bin/find /var/www/piler/tmp
-type f -name i.\* -exec rm -f {} \;) Feb 14 02:30:01 hostname
CRON[7836]: (piler) CMD (/usr/bin/find /var/piler/error -type f|wc -l
> /var/piler/stat/error) Feb 14 02:30:01 hostname CRON[7842]: (piler)
CMD (/usr/bin/indexer --quiet note1 --rotate) Feb 14 02:30:01
hostname CRON[7837]: (piler) CMD (/usr/bin/indexer --quiet tag1
--rotate) Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed,
discarding output) Feb 14 02:30:01 hostname CRON[7832]: (CRON) info
(No MTA installed, discarding output) Feb 14 02:30:01 hostname piler:
INDEXER INFO: merging to main started Feb 14 02:30:40 hostname
piler-smtp[19066]: connected from
192.168.165.252:37934 on fd=6 (active connections: 1) Feb 14 02:30:41
hostname piler-smtp[19066]: connected from
192.168.165.252:37936 on fd=9 (active connections: 2) Feb 14 02:30:41
hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
piler-smtp[19066]: freeing session Feb 14 02:30:41 hostname
piler-smtp[19066]: free(session) done Feb 14 02:30:41 hostname
piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session() Feb
14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown() Feb 14
02:30:41 hostname piler-smtp[19066]: SSL_free() Feb 14 02:30:41
hostname piler-smtp[19066]: SSL_CTX_free Feb 14 02:30:41 hostname
piler-smtp[19066]: freeing session Feb 14 02:30:41 hostname
piler-smtp[19066]: free(session) done Feb 14 02:31:44 hostname
piler-smtp[19066]: connected from
192.168.165.252:40584 on fd=6 (active connections: 1) Feb 14 02:31:53
hostname pile

Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-15 Thread sj




Hello Patrick,

thank you for your feedback. I'd like you to update the smtpsegv branch,
and recompile piler-smtp. I've refactored the timeout checks, and pushed
the commit to the branch on bitbucket.

Janos


On 2021-02-15 14:34, Wagner, Patrick wrote:

Hello Janos,


happened Sa->So this time.


Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) see
below, SIGSEGV follows right after free(session) done.

It is also interesting to see that the piler daemon finishes
processing a message 2 whole minutes after the smtp process has
already segfaulted and and stopped supplying any new messages.


Thanks, Patrick



Feb 14 02:29:59 hostname piler-smtp[19066]: connected from
192.168.165.154:9501 on fd=6 (active connections: 1)
Feb 14 02:29:59 hostname piler-smtp[19066]: received:
Q056C70SAJJT9OFH, from=, size=4944, client=192.168.165.154, fd=6,
fsync=1831
Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from
192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)
Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:29:59 hostname piler-smtp[19066]: freeing session
Feb 14 02:29:59 hostname piler-smtp[19066]: free(session) done
Feb 14 02:30:00 hostname piler[19070]: 1/Q056C70SAJJT9OFH:
400060287d2205ab9b5400229f77e67a, size=4944/1928, attachments=0,
reference=, message-id=<4DdV5z0DlSz
5vfP@mailrelay.intern>, retention=7350, folder=0, delay=0.0425, 
status=stored

Feb 14 02:30:01 hostname CRON[7835]: (piler) CMD (/usr/bin/find
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb 14 02:30:01 hostname CRON[7836]: (piler) CMD (/usr/bin/find
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb 14 02:30:01 hostname CRON[7842]: (piler) CMD (/usr/bin/indexer
--quiet note1 --rotate)
Feb 14 02:30:01 hostname CRON[7837]: (piler) CMD (/usr/bin/indexer
--quiet tag1 --rotate)
Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed,
discarding output)
Feb 14 02:30:01 hostname CRON[7832]: (CRON) info (No MTA installed,
discarding output)
Feb 14 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb 14 02:30:40 hostname piler-smtp[19066]: connected from
192.168.165.252:37934 on fd=6 (active connections: 1)
Feb 14 02:30:41 hostname piler-smtp[19066]: connected from
192.168.165.252:37936 on fd=9 (active connections: 2)
Feb 14 02:30:41 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:30:41 hostname piler-smtp[19066]: freeing session
Feb 14 02:30:41 hostname piler-smtp[19066]: free(session) done
Feb 14 02:30:41 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:30:41 hostname piler-smtp[19066]: freeing session
Feb 14 02:30:41 hostname piler-smtp[19066]: free(session) done
Feb 14 02:31:44 hostname piler-smtp[19066]: connected from
192.168.165.252:40584 on fd=6 (active connections: 1)
Feb 14 02:31:53 hostname piler-smtp[19066]: connected from
192.168.165.252:40586 on fd=9 (active connections: 2)
Feb 14 02:31:53 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)
Feb 14 02:31:53 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:31:53 hostname piler-smtp[19066]: freeing session
Feb 14 02:31:53 hostname piler-smtp[19066]: free(session) done
Feb 14 02:31:53 hostname piler-smtp[19066]: disconnected from
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)
Feb 14 02:31:53 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:31:53 hostname piler-smtp[19066]: freeing session
Feb 14 02:31:53 hostname piler-smtp[19066]: free(session) done
Feb 14 02:32:40 hostname piler-smtp[19066]: connected from
192.168.165.252:43330 on fd=6 (active connections: 1)
Feb 14 02:32:40 hostname piler-smtp[19066]: connected from
192.168.165.252:43334 on fd=9 (active connections: 2)

Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-15 Thread Wagner, Patrick




Hello Janos,


happened Sa->So this time.


Logs (piler-smtp as compiled out of smtpsegv.tar.gz source code) see 
below, SIGSEGV follows right after free(session) done.


It is also interesting to see that the piler daemon finishes processing 
a message 2 whole minutes after the smtp process has already segfaulted 
and and stopped supplying any new messages.



Thanks, Patrick



Feb 14 02:29:59 hostname piler-smtp[19066]: connected from 
192.168.165.154:9501 on fd=6 (active connections: 1)
Feb 14 02:29:59 hostname piler-smtp[19066]: received: Q056C70SAJJT9OFH, 
from=, size=4944, client=192.168.165.154, fd=6, fsync=1831
Feb 14 02:29:59 hostname piler-smtp[19066]: disconnected from 
192.168.165.154 on fd=6, slot=0, reason=done (0 active connections)

Feb 14 02:29:59 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:29:59 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:29:59 hostname piler-smtp[19066]: freeing session
Feb 14 02:29:59 hostname piler-smtp[19066]: free(session) done
Feb 14 02:30:00 hostname piler[19070]: 1/Q056C70SAJJT9OFH: 
400060287d2205ab9b5400229f77e67a, size=4944/1928, attachments=0, 
reference=, message-id=<4DdV5z0DlSz
5vfP@mailrelay.intern>, retention=7350, folder=0, delay=0.0425, 
status=stored
Feb 14 02:30:01 hostname CRON[7835]: (piler) CMD (/usr/bin/find 
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb 14 02:30:01 hostname CRON[7836]: (piler) CMD (/usr/bin/find 
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb 14 02:30:01 hostname CRON[7842]: (piler) CMD (/usr/bin/indexer 
--quiet note1 --rotate)
Feb 14 02:30:01 hostname CRON[7837]: (piler) CMD (/usr/bin/indexer 
--quiet tag1 --rotate)
Feb 14 02:30:01 hostname CRON[7838]: (piler) CMD 
(/usr/local/libexec/piler/indexer.main.sh)
Feb 14 02:30:01 hostname CRON[7833]: (CRON) info (No MTA installed, 
discarding output)
Feb 14 02:30:01 hostname CRON[7832]: (CRON) info (No MTA installed, 
discarding output)

Feb 14 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb 14 02:30:40 hostname piler-smtp[19066]: connected from 
192.168.165.252:37934 on fd=6 (active connections: 1)
Feb 14 02:30:41 hostname piler-smtp[19066]: connected from 
192.168.165.252:37936 on fd=9 (active connections: 2)
Feb 14 02:30:41 hostname piler-smtp[19066]: disconnected from 
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)

Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:30:41 hostname piler-smtp[19066]: freeing session
Feb 14 02:30:41 hostname piler-smtp[19066]: free(session) done
Feb 14 02:30:41 hostname piler-smtp[19066]: disconnected from 
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)

Feb 14 02:30:41 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:30:41 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:30:41 hostname piler-smtp[19066]: freeing session
Feb 14 02:30:41 hostname piler-smtp[19066]: free(session) done
Feb 14 02:31:44 hostname piler-smtp[19066]: connected from 
192.168.165.252:40584 on fd=6 (active connections: 1)
Feb 14 02:31:53 hostname piler-smtp[19066]: connected from 
192.168.165.252:40586 on fd=9 (active connections: 2)
Feb 14 02:31:53 hostname piler-smtp[19066]: disconnected from 
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)

Feb 14 02:31:53 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:31:53 hostname piler-smtp[19066]: freeing session
Feb 14 02:31:53 hostname piler-smtp[19066]: free(session) done
Feb 14 02:31:53 hostname piler-smtp[19066]: disconnected from 
192.168.165.252 on fd=9, slot=1, reason=done (0 active connections)

Feb 14 02:31:53 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_shutdown()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_free()
Feb 14 02:31:53 hostname piler-smtp[19066]: SSL_CTX_free
Feb 14 02:31:53 hostname piler-smtp[19066]: freeing session
Feb 14 02:31:53 hostname piler-smtp[19066]: free(session) done
Feb 14 02:32:40 hostname piler-smtp[19066]: connected from 
192.168.165.252:43330 on fd=6 (active connections: 1)
Feb 14 02:32:40 hostname piler-smtp[19066]: connected from 
192.168.165.252:43334 on fd=9 (active connections: 2)
Feb 14 02:32:40 hostname piler-smtp[19066]: disconnected from 
192.168.165.252 on fd=6, slot=0, reason=done (1 active connections)

Feb 14 02:32:40 hostname piler-smtp[19066]: free_smtp_session()
Feb 14 02:32:40 hostname p

Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-10 Thread Wagner, Patrick




Hello Janos,


happened again, the same way, with the recompiled piler-smtp binary (git 
master).


~# piler-smtp -V
1.3.10 build 998

Feb 10 02:30:01 hostname CRON[8429]: (piler) CMD (/usr/bin/indexer 
--quiet note1 --rotate)
Feb 10 02:30:01 hostname CRON[8427]: (piler) CMD (/usr/bin/find 
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb 10 02:30:01 hostname CRON[8428]: (piler) CMD (/usr/bin/find 
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb 10 02:30:01 hostname CRON[8431]: (piler) CMD 
(/usr/local/libexec/piler/indexer.main.sh)
Feb 10 02:30:01 hostname CRON[8430]: (piler) CMD (/usr/bin/indexer 
--quiet tag1 --rotate)
Feb 10 02:30:01 hostname CRON[8424]: (CRON) info (No MTA installed, 
discarding output)
Feb 10 02:30:01 hostname CRON[8425]: (CRON) info (No MTA installed, 
discarding output)

Feb 10 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb 10 02:30:17 hostname piler-smtp[27596]: connected from 
192.168.165.252:57902 on fd=6 (active connections: 1)
Feb 10 02:30:17 hostname piler-smtp[27596]: connected from 
192.168.165.252:57908 on fd=9 (active connections: 2)
Feb 10 02:30:17 hostname piler-smtp[27596]: disconnected from 
192.168.165.252 on fd=6, slot=0 (1 active connections)
Feb 10 02:30:17 hostname piler-smtp[27596]: disconnected from 
192.168.165.252 on fd=9, slot=1 (0 active connections)
Feb 10 02:30:31 hostname piler-smtp[27596]: connected from 
192.168.165.154:32120 on fd=6 (active connections: 1)
Feb 10 02:31:47 hostname piler-smtp[27596]: received: ZQAHOOJ8RXM5NCHD, 
from=, size=4945, client=192.168.165.154, fd=6, fsync=75881359
Feb 10 02:31:47 hostname piler-smtp[27596]: client 192.168.165.154 
timeout, lasttime: 1612920631
Feb 10 02:31:47 hostname piler-smtp[27596]: disconnected from 
192.168.165.154 on fd=6, slot=0 (0 active connections)
Feb 10 02:31:47 hostname piler-smtp[27596]: got signal: 11, Segmentation 
fault

Feb 10 02:31:47 hostname piler-smtp[27596]: piler-smtp has been terminated
Feb 10 02:31:48 hostname piler[27599]: 0/ZQAHOOJ8RXM5NCHD: 
40006023378d2c9c99d400a51d237bc0, size=4945/1936, attachments=0, 
reference=, message-id=<4Db2JR32vQz5vS9@mailrelay.intern>, 
retention=7350, folder=0, delay=0.9705, status=stored


There were no changes to backup schedules recently, and no scheduled job 
or I/O heavy maintenance should happen, especially not at this point in 
time. Still, point taken, we will check whether there could be anything, 
but why would only 1-2 VMs out of hundreds be affected?


In any case, SIGSEGV still happens and my piler-smtp process is dead.

- Patrick

On 09.02.2021 09:04, s...@acts.hu wrote:



Hello Patrick,

I've been reported such issues when a backup system paused or froze
the virtual machine running piler. I suspect that it might be the
case for you based on the time and days you mentioned.

The problem in such cases is that piler-smtp is in the middle of
processing an email, then the system is frozen for a few minutes
enough to pass the smtp timeout limit (60 sec), and then when it's
unfrozen then piler-smtp notices the timeout situation, then tears
down the connection. However, at the end of the session it tries
to tear down the already terminated and freed connection data.

So, let's try the following: get the master branch, recompile piler,
however replace the piler-smtp binary only, nothing else. Then let's
see if the problem is solved.

Janos

On 2021-02-09 07:56, Wagner, Patrick wrote:

Hello everyone,

a few weeks ago 2 of our Piler instances began segfaulting - more
precisely, several times per month (but not every day night) at about
02:30 am local time the MS Exchange server that has the Piler instance
setup as journaling target connects to piler-smtp, upon which
piler-smtp segfaults after trying to store an email that clearly
doesn't get parsed correctly by piler-smtp as it is represented as
gibberish (and gibberish cannot be renamed...)

See below for a few of the latest examples. Interestingly, the SIGSEGV
did not happen Sunday and Monday night, meaning the day following a
non-weekday? Not going on long enough that I could call it a pattern,
but in any case it would be nice if piler-smtp didn't segfault in the
first place.

What can I do to supply more useful information to get this fixed? Or
has there been meaningful work on piler-smtp in the meantime that this
could / would've already been fixed?


Thanks, Patrick


Feb  4 02:30:01 hostname CRON[24428]: (piler) CMD (/usr/bin/find
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb  4 02:30:01 hostname CRON[24430]: (piler) CMD (/usr/bin/find
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb  4 02:30:01 hostname CRON[24429]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb  4 02:30:01 hostname CRON[24431]: (piler) CMD (/usr/bin/indexer
--quiet note1 --rotate)
Feb  4 02:30:01 hostname CRON[24436]: (piler) CMD (/usr/bin/indexer
--quiet tag1 --rotate)
Feb  4 02:30:01 hostname piler: INDEXER INFO: merging to main

Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-10 Thread sj




Hello Patrick,

then we need to dig deeper to understand what's going on. I created
a new branch to have some extra debugging. Please check out this commit:
https://bitbucket.org/jsuto/piler/commits/ac434e73ef86c031ee451284a00fc7062c579f9c

Perhaps it's easier to get 
https://bitbucket.org/jsuto/piler/get/smtpsegv.tar.gz

compile it, and overwrite only the piler-smtp binary.

Janos

On 2021-02-10 17:41, Wagner, Patrick wrote:

Hello Janos,


happened again, the same way, with the recompiled piler-smtp binary
(git master).

~# piler-smtp -V
1.3.10 build 998

Feb 10 02:30:01 hostname CRON[8429]: (piler) CMD (/usr/bin/indexer
--quiet note1 --rotate)
Feb 10 02:30:01 hostname CRON[8427]: (piler) CMD (/usr/bin/find
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb 10 02:30:01 hostname CRON[8428]: (piler) CMD (/usr/bin/find
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb 10 02:30:01 hostname CRON[8431]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb 10 02:30:01 hostname CRON[8430]: (piler) CMD (/usr/bin/indexer
--quiet tag1 --rotate)
Feb 10 02:30:01 hostname CRON[8424]: (CRON) info (No MTA installed,
discarding output)
Feb 10 02:30:01 hostname CRON[8425]: (CRON) info (No MTA installed,
discarding output)
Feb 10 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb 10 02:30:17 hostname piler-smtp[27596]: connected from
192.168.165.252:57902 on fd=6 (active connections: 1)
Feb 10 02:30:17 hostname piler-smtp[27596]: connected from
192.168.165.252:57908 on fd=9 (active connections: 2)
Feb 10 02:30:17 hostname piler-smtp[27596]: disconnected from
192.168.165.252 on fd=6, slot=0 (1 active connections)
Feb 10 02:30:17 hostname piler-smtp[27596]: disconnected from
192.168.165.252 on fd=9, slot=1 (0 active connections)
Feb 10 02:30:31 hostname piler-smtp[27596]: connected from
192.168.165.154:32120 on fd=6 (active connections: 1)
Feb 10 02:31:47 hostname piler-smtp[27596]: received:
ZQAHOOJ8RXM5NCHD, from=, size=4945, client=192.168.165.154, fd=6,
fsync=75881359
Feb 10 02:31:47 hostname piler-smtp[27596]: client 192.168.165.154
timeout, lasttime: 1612920631
Feb 10 02:31:47 hostname piler-smtp[27596]: disconnected from
192.168.165.154 on fd=6, slot=0 (0 active connections)
Feb 10 02:31:47 hostname piler-smtp[27596]: got signal: 11, 
Segmentation fault
Feb 10 02:31:47 hostname piler-smtp[27596]: piler-smtp has been 
terminated

Feb 10 02:31:48 hostname piler[27599]: 0/ZQAHOOJ8RXM5NCHD:
40006023378d2c9c99d400a51d237bc0, size=4945/1936, attachments=0,
reference=, message-id=<4Db2JR32vQz5vS9@mailrelay.intern>,
retention=7350, folder=0, delay=0.9705, status=stored

There were no changes to backup schedules recently, and no scheduled
job or I/O heavy maintenance should happen, especially not at this
point in time. Still, point taken, we will check whether there could
be anything, but why would only 1-2 VMs out of hundreds be affected?

In any case, SIGSEGV still happens and my piler-smtp process is dead.

- Patrick

On 09.02.2021 09:04, s...@acts.hu wrote:



Hello Patrick,

I've been reported such issues when a backup system paused or froze
the virtual machine running piler. I suspect that it might be the
case for you based on the time and days you mentioned.

The problem in such cases is that piler-smtp is in the middle of
processing an email, then the system is frozen for a few minutes
enough to pass the smtp timeout limit (60 sec), and then when it's
unfrozen then piler-smtp notices the timeout situation, then tears
down the connection. However, at the end of the session it tries
to tear down the already terminated and freed connection data.

So, let's try the following: get the master branch, recompile piler,
however replace the piler-smtp binary only, nothing else. Then let's
see if the problem is solved.

Janos

On 2021-02-09 07:56, Wagner, Patrick wrote:

Hello everyone,

a few weeks ago 2 of our Piler instances began segfaulting - more
precisely, several times per month (but not every day night) at about
02:30 am local time the MS Exchange server that has the Piler 
instance

setup as journaling target connects to piler-smtp, upon which
piler-smtp segfaults after trying to store an email that clearly
doesn't get parsed correctly by piler-smtp as it is represented as
gibberish (and gibberish cannot be renamed...)

See below for a few of the latest examples. Interestingly, the 
SIGSEGV

did not happen Sunday and Monday night, meaning the day following a
non-weekday? Not going on long enough that I could call it a pattern,
but in any case it would be nice if piler-smtp didn't segfault in the
first place.

What can I do to supply more useful information to get this fixed? Or
has there been meaningful work on piler-smtp in the meantime that 
this

could / would've already been fixed?


Thanks, Patrick


Feb  4 02:30:01 hostname CRON[24428]: (piler) CMD (/usr/bin/find
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb  4 02:30:01 hostname CRON[24430]: (piler) 

Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-09 Thread Wagner, Patrick




Hello everyone,

a few weeks ago 2 of our Piler instances began segfaulting - more 
precisely, several times per month (but not every day night) at about 
02:30 am local time the MS Exchange server that has the Piler instance 
setup as journaling target connects to piler-smtp, upon which piler-smtp 
segfaults after trying to store an email that clearly doesn't get parsed 
correctly by piler-smtp as it is represented as gibberish (and gibberish 
cannot be renamed...)


See below for a few of the latest examples. Interestingly, the SIGSEGV 
did not happen Sunday and Monday night, meaning the day following a 
non-weekday? Not going on long enough that I could call it a pattern, 
but in any case it would be nice if piler-smtp didn't segfault in the 
first place.


What can I do to supply more useful information to get this fixed? Or 
has there been meaningful work on piler-smtp in the meantime that this 
could / would've already been fixed?



Thanks, Patrick


Feb  4 02:30:01 hostname CRON[24428]: (piler) CMD (/usr/bin/find 
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb  4 02:30:01 hostname CRON[24430]: (piler) CMD (/usr/bin/find 
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb  4 02:30:01 hostname CRON[24429]: (piler) CMD 
(/usr/local/libexec/piler/indexer.main.sh)
Feb  4 02:30:01 hostname CRON[24431]: (piler) CMD (/usr/bin/indexer 
--quiet note1 --rotate)
Feb  4 02:30:01 hostname CRON[24436]: (piler) CMD (/usr/bin/indexer 
--quiet tag1 --rotate)

Feb  4 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb  4 02:30:01 hostname CRON[24426]: (CRON) info (No MTA installed, 
discarding output)
Feb  4 02:30:01 hostname CRON[24425]: (CRON) info (No MTA installed, 
discarding output)
Feb  4 02:30:20 hostname piler-smtp[1144]: connected from 
192.168.165.252:58292 on fd=6 (active connections: 1)
Feb  4 02:30:20 hostname piler-smtp[1144]: connected from 
192.168.165.252:58294 on fd=9 (active connections: 2)
Feb  4 02:30:20 hostname piler-smtp[1144]: disconnected from 
192.168.165.252 on fd=6 (1 active connections)
Feb  4 02:30:20 hostname piler-smtp[1144]: disconnected from 
192.168.165.252 on fd=9 (0 active connections)
Feb  4 02:30:27 hostname piler-smtp[1144]: connected from 
192.168.165.154:41139 on fd=6 (active connections: 1)

Feb  4 02:32:18 hostname piler-smtp[1144]: client 192.168.165.154 timeout
Feb  4 02:32:18 hostname piler-smtp[1144]: disconnected from 
192.168.165.154 on fd=6 (0 active connections)

Feb  4 02:32:18 hostname piler-smtp[1144]: Removing 6KHX6ZMY7OD1R0KG
Feb  4 02:32:18 hostname piler-smtp[1144]: received: #013}^?, 
from=, size=4940, client=192.168.165.154, fd=6
Feb  4 02:32:18 hostname piler-smtp[1144]: ERROR: couldn't rename 
#013}^? to 2/ #013}^? (reason: no such file or directory)
Feb  4 02:32:18 hostname piler-smtp[1144]: got signal: 11, Segmentation 
fault

Feb  4 02:32:18 hostname piler-smtp[1144]: piler-smtp has been terminated

Feb  5 02:32:15 hostname piler-smtp[28329]: connected from 
192.168.165.252:57818 on fd=6 (active connections: 1)
Feb  5 02:32:15 hostname piler-smtp[28329]: connected from 
192.168.165.252:57820 on fd=9 (active connections: 2)
Feb  5 02:32:15 hostname piler-smtp[28329]: disconnected from 
192.168.165.252 on fd=6 (1 active connections)
Feb  5 02:32:15 hostname piler-smtp[28329]: disconnected from 
192.168.165.252 on fd=9 (0 active connections)
Feb  5 02:33:15 hostname piler-smtp[28329]: connected from 
192.168.165.252:60458 on fd=6 (active connections: 1)
Feb  5 02:33:15 hostname piler-smtp[28329]: connected from 
192.168.165.252:60460 on fd=9 (active connections: 2)
Feb  5 02:33:15 hostname piler-smtp[28329]: disconnected from 
192.168.165.252 on fd=6 (1 active connections)
Feb  5 02:33:15 hostname piler-smtp[28329]: disconnected from 
192.168.165.252 on fd=9 (0 active connections)
Feb  5 02:34:15 hostname piler-smtp[28329]: connected from 
192.168.165.252:34840 on fd=6 (active connections: 1)
Feb  5 02:34:15 hostname piler-smtp[28329]: connected from 
192.168.165.252:34842 on fd=9 (active connections: 2)
Feb  5 02:34:15 hostname piler-smtp[28329]: disconnected from 
192.168.165.252 on fd=6 (1 active connections)
Feb  5 02:34:15 hostname piler-smtp[28329]: disconnected from 
192.168.165.252 on fd=9 (0 active connections)
Feb  5 02:34:29 hostname piler-smtp[28329]: connected from 
192.168.165.154:25626 on fd=6 (active connections: 1)
Feb  5 02:35:01 hostname CRON[30058]: (piler) CMD (/usr/bin/find 
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb  5 02:35:01 hostname CRON[30060]: (root) CMD (command -v debian-sa1 
> /dev/null && debian-sa1 1 1)
Feb  5 02:35:01 hostname CRON[30059]: (piler) CMD 
(/usr/local/libexec/piler/indexer.delta.sh)
Feb  5 02:35:01 hostname CRON[30063]: (piler) CMD (/usr/bin/find 
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb  5 02:35:01 hostname piler: INDEXER ERROR: indexer merging to main 
index is already running. It started at Fr 5. Feb 02:30:01 CET 2021

Re: Piler 1.35 on Ubuntu 18.04 - piler-smtp SIGSEGV

2021-02-09 Thread sj




Hello Patrick,

I've been reported such issues when a backup system paused or froze
the virtual machine running piler. I suspect that it might be the
case for you based on the time and days you mentioned.

The problem in such cases is that piler-smtp is in the middle of
processing an email, then the system is frozen for a few minutes
enough to pass the smtp timeout limit (60 sec), and then when it's
unfrozen then piler-smtp notices the timeout situation, then tears
down the connection. However, at the end of the session it tries
to tear down the already terminated and freed connection data.

So, let's try the following: get the master branch, recompile piler,
however replace the piler-smtp binary only, nothing else. Then let's
see if the problem is solved.

Janos

On 2021-02-09 07:56, Wagner, Patrick wrote:

Hello everyone,

a few weeks ago 2 of our Piler instances began segfaulting - more
precisely, several times per month (but not every day night) at about
02:30 am local time the MS Exchange server that has the Piler instance
setup as journaling target connects to piler-smtp, upon which
piler-smtp segfaults after trying to store an email that clearly
doesn't get parsed correctly by piler-smtp as it is represented as
gibberish (and gibberish cannot be renamed...)

See below for a few of the latest examples. Interestingly, the SIGSEGV
did not happen Sunday and Monday night, meaning the day following a
non-weekday? Not going on long enough that I could call it a pattern,
but in any case it would be nice if piler-smtp didn't segfault in the
first place.

What can I do to supply more useful information to get this fixed? Or
has there been meaningful work on piler-smtp in the meantime that this
could / would've already been fixed?


Thanks, Patrick


Feb  4 02:30:01 hostname CRON[24428]: (piler) CMD (/usr/bin/find
/var/www/piler/tmp -type f -name i.\* -exec rm -f {} \;)
Feb  4 02:30:01 hostname CRON[24430]: (piler) CMD (/usr/bin/find
/var/piler/error -type f|wc -l > /var/piler/stat/error)
Feb  4 02:30:01 hostname CRON[24429]: (piler) CMD
(/usr/local/libexec/piler/indexer.main.sh)
Feb  4 02:30:01 hostname CRON[24431]: (piler) CMD (/usr/bin/indexer
--quiet note1 --rotate)
Feb  4 02:30:01 hostname CRON[24436]: (piler) CMD (/usr/bin/indexer
--quiet tag1 --rotate)
Feb  4 02:30:01 hostname piler: INDEXER INFO: merging to main started
Feb  4 02:30:01 hostname CRON[24426]: (CRON) info (No MTA installed,
discarding output)
Feb  4 02:30:01 hostname CRON[24425]: (CRON) info (No MTA installed,
discarding output)
Feb  4 02:30:20 hostname piler-smtp[1144]: connected from
192.168.165.252:58292 on fd=6 (active connections: 1)
Feb  4 02:30:20 hostname piler-smtp[1144]: connected from
192.168.165.252:58294 on fd=9 (active connections: 2)
Feb  4 02:30:20 hostname piler-smtp[1144]: disconnected from
192.168.165.252 on fd=6 (1 active connections)
Feb  4 02:30:20 hostname piler-smtp[1144]: disconnected from
192.168.165.252 on fd=9 (0 active connections)
Feb  4 02:30:27 hostname piler-smtp[1144]: connected from
192.168.165.154:41139 on fd=6 (active connections: 1)
Feb  4 02:32:18 hostname piler-smtp[1144]: client 192.168.165.154 
timeout

Feb  4 02:32:18 hostname piler-smtp[1144]: disconnected from
192.168.165.154 on fd=6 (0 active connections)
Feb  4 02:32:18 hostname piler-smtp[1144]: Removing 6KHX6ZMY7OD1R0KG
Feb  4 02:32:18 hostname piler-smtp[1144]: received: #013}^?,
from=, size=4940, client=192.168.165.154, fd=6
Feb  4 02:32:18 hostname piler-smtp[1144]: ERROR: couldn't rename
#013}^? to 2/ #013}^? (reason: no such file or
directory)
Feb  4 02:32:18 hostname piler-smtp[1144]: got signal: 11, Segmentation 
fault
Feb  4 02:32:18 hostname piler-smtp[1144]: piler-smtp has been 
terminated


Feb  5 02:32:15 hostname piler-smtp[28329]: connected from
192.168.165.252:57818 on fd=6 (active connections: 1)
Feb  5 02:32:15 hostname piler-smtp[28329]: connected from
192.168.165.252:57820 on fd=9 (active connections: 2)
Feb  5 02:32:15 hostname piler-smtp[28329]: disconnected from
192.168.165.252 on fd=6 (1 active connections)
Feb  5 02:32:15 hostname piler-smtp[28329]: disconnected from
192.168.165.252 on fd=9 (0 active connections)
Feb  5 02:33:15 hostname piler-smtp[28329]: connected from
192.168.165.252:60458 on fd=6 (active connections: 1)
Feb  5 02:33:15 hostname piler-smtp[28329]: connected from
192.168.165.252:60460 on fd=9 (active connections: 2)
Feb  5 02:33:15 hostname piler-smtp[28329]: disconnected from
192.168.165.252 on fd=6 (1 active connections)
Feb  5 02:33:15 hostname piler-smtp[28329]: disconnected from
192.168.165.252 on fd=9 (0 active connections)
Feb  5 02:34:15 hostname piler-smtp[28329]: connected from
192.168.165.252:34840 on fd=6 (active connections: 1)
Feb  5 02:34:15 hostname piler-smtp[28329]: connected from
192.168.165.252:34842 on fd=9 (active connections: 2)
Feb  5 02:34:15 hostname piler-smtp[28329]: disconnected from
192.168.165.252 on fd=6 (1 active connections)
Feb  5 02:34:15 hostname pil