Hi László,

Thanks for getting back to me

On Sat, 29 Nov 2014, László Böszörményi (GCS) wrote:
> > Package: libsqlite3-0
> > Version: 3.7.13-1+deb7u1
> > Severity: serious

> > Originally detected I believe on sid installation, but forgot to capture the
> > version, I will try to replicate/report later.  But very consistent with
> > wheezy (from which I am reporting now).
>  May you give me some details how it happened in Sid?

OK -- I will try to replicate it again under sid (trickier, since I have no sid
on publicly bombarded servers).

> > Triggered by the backport fail2ban 0.9.1-1~nd70+1 (available from
> > http://neuro.debian.net/debian-devel/ wheezy/main amd64 Packages  apt repo)
> > it gets to
> [...]
>  The problem is, I don't see the segfault in the mentioned gdb output.

well -- I didn't think it would be useful, since the backtrace has that call
on top and you would just trust me ;) but here you go:

# gdb --args /usr/bin/python-dbg /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -f
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/python-dbg...done.
(gdb) c
The program is not being run.
(gdb) r
Starting program: /usr/bin/python-dbg /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -f
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
2014-11-24 08:52:18,796 fail2ban.server         [29794]: INFO    Starting Fail2ban v0.9.1
2014-11-24 08:52:26,467 fail2ban.server         [29794]: INFO    Stopping all jails
[New Thread 0x7ffff52a4700 (LWP 29812)]
[New Thread 0x7ffff4aa3700 (LWP 29813)]
[New Thread 0x7fffeffff700 (LWP 29814)]
[Thread 0x7ffff52a4700 (LWP 29812) exited]
[Thread 0x7fffeffff700 (LWP 29814) exited]
[Thread 0x7ffff4aa3700 (LWP 29813) exited]
[New Thread 0x7ffff4aa3700 (LWP 29895)]
[New Thread 0x7fffeffff700 (LWP 29896)]
[New Thread 0x7ffff52a4700 (LWP 29897)]
[Thread 0x7ffff4aa3700 (LWP 29895) exited]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff52a4700 (LWP 29897)]
sqlite3_value_type (pVal=0x0) at sqlite3.c:63805
63805   sqlite3.c: No such file or directory.
(gdb)


> > unfortunately we haven't logged the sql queries so not sure on exact one,
> > but I think it was this one, which if executed from the shell seems to not
> > cause the segfault...

> > for n in {1..100}; do sqlite3 -list fail2ban.sqlite3 'SELECT ip, timeofban, data
> > FROM bans WHERE 1 AND jail="sshd" AND ip="111.74.239.35" ORDER BY ip,
> > timeofban' >/dev/null && echo success; done
>  Then how often do you get segfault? 

once every few days

> Do you have any additional
> information if it happens in a given daytime, when there are several
> bots try to get into your system or anything else?

I will keep monitoring/analyzing more -- so far I haven't spotted any pattern.

> > Please help me to troubleshoot this one if more information is necessary
> > to point the issue
>  I'm the SQLite3 maintainer and not the fail2ban one

FWIW -- I am the Fail2ban maintainer... but fail2ban is just a 'trigger'
here -- it is a pure Python implementation, so it is either sqlite3 or the
Python bindings that are at fault here

> But please note
> the changelog the version you use[1]:
>      - 0.9 series is quite a big leap in development, especially since 0.8.6
>        which made it to previous Debian stable wheezy.  Please consult
>        upstream ChangeLog about changes

> Did you check it, reviewed your configuration? 

configuration of ... fail2ban?  It is somewhat minimalistic -- just a single
jail, which is why my original suspicion of possibly incorrect locking for the
same DB across threads didn't get much of my "mental support" ;) Although there
are still a few threads involved:

  Id   Target Id         Frame 
* 7    Thread 0x7ffff52a4700 (LWP 29897) "python-dbg" sqlite3_value_type 
(pVal=0x0) at sqlite3.c:63805
  6    Thread 0x7fffeffff700 (LWP 29896) "python-dbg" 0x0000000000525338 in 
PyEval_EvalCodeEx (co=0xbe9ca0, globals=
    {'logMultiprocessing': 1, '__path__': ['/usr/lib/python2.7/logging'], 
'LogRecord': <type at remote 0xc29c70>, 'logProcesses': 1, 'addLevelName': 
<function at remote 0xc6e6f0>, '_addHandlerRef': <function at remote 0xc70798>, 
'WARNING': 30, 'fatal': <function at remote 0xc751b0>, 'currentframe': 
<function at remote 0xc6e648>, 'INFO': 20, '_startTime': <float at remote 
0x9d38f8>, '__file__': '/usr/lib/python2.7/logging/__init__.pyc', 
'BufferingFormatter': <type at remote 0xc2aca0>, 'NOTSET': 0, '_levelNames': 
{0: 'NOTSET', 'NOTICE': 25, 10: 'DEBUG', 'WARN': 30, 20: 'INFO', 'ERROR': 40, 
'DEBUG': 10, 25: 'NOTICE', 30: 'WARNING', 'INFO': 20, 'WARNING': 30, 40: 
'ERROR', 50: 'CRITICAL', 'CRITICAL': 50, 'NOTSET': 0}, '__date__': '07 February 
2010', 'getLogger': <function at remote 0xc75108>, 'debug': <function at remote 
0xc754f8>, 'basicConfig': <function at remote 0xc718e8>, 'cStringIO': <module 
at remote 0xbee9b8>, '_acquireLock': <function at remote 0xc6e840>, 'atexit': 
<module at remote 0xc55a20>, 'Formatter': <t...(truncated), locals=0x0, 
args=0x10cc8e8, argcount=2, kws=
    0x10cc8f8, kwcount=0, defs=0x0, defcount=0, closure=0x0) at 
../Python/ceval.c:3068
  1    Thread 0x7ffff7fb9700 (LWP 29794) "python-dbg" 0x00007ffff706b2b3 in 
select () at ../sysdeps/unix/syscall-template.S:82


but code at 6 seems unlikely to deal with DB at that point:

(gdb) thread 6
[Switching to thread 6 (Thread 0x7fffeffff700 (LWP 29896))]
#0  0x0000000000525338 in PyEval_EvalCodeEx (co=0xbe9ca0, globals=
    {'logMultiprocessing': 1, '__path__': ['/usr/lib/python2.7/logging'], 
'LogRecord': <type at remote 0xc29c70>, 'logProcesses': 1, 'addLevelName': 
<function at remote 0xc6e6f0>, '_addHandlerRef': <function at remote 0xc70798>, 
'WARNING': 30, 'fatal': <function at remote 0xc751b0>, 'currentframe': 
<function at remote 0xc6e648>, 'INFO': 20, '_startTime': <float at remote 
0x9d38f8>, '__file__': '/usr/lib/python2.7/logging/__init__.pyc', 
'BufferingFormatter': <type at remote 0xc2aca0>, 'NOTSET': 0, '_levelNames': {0: 
'NOTSET', 'NOTICE': 25, 10: 'DEBUG', 'WARN': 30, 20: 'INFO', 'ERROR': 40, 
'DEBUG': 10, 25: 'NOTICE', 30: 'WARNING', 'INFO': 20, 'WARNING': 30, 40: 
'ERROR', 50: 'CRITICAL', 'CRITICAL': 50, 'NOTSET': 0}, '__date__': '07 February 
2010', 'getLogger': <function at remote 0xc75108>, 'debug': <function at remote 
0xc754f8>, 'basicConfig': <function at remote 0xc718e8>, 'cStringIO': <module 
at remote 0xbee9b8>, '_acquireLock': <function at remote 0xc6e840>, 'atexit': 
<module at remote 0xc55a20>, 'Formatter': <t...(truncated), locals=0x0, 
args=0x10cc8e8, argcount=2, kws=
    0x10cc8f8, kwcount=0, defs=0x0, defcount=0, closure=0x0) at 
../Python/ceval.c:3068
3068    ../Python/ceval.c: No such file or directory.
(gdb) py-bt
#3 Frame 0x10cc750, for file 
/usr/lib/python2.7/dist-packages/fail2ban/server/datedetector.py, line 204, in 
sortTemplate (self=<DateDetector(_DateDetector__lock=<thread.lock at remote 
0x106f040>, 
_DateDetector__templates=[<DatePatternRegex(_cRegex=<_sre.SRE_Pattern at remote 
0x10a05a0>, hits=25832, _pattern='(?:%a )?%b %d %H:%M:%S(?:\\.%f)?(?: %Y)?', 
_regex='\\b(?:(?P<a>mon|tue|wed|thu|fri|sat|sun) 
)?(?P<b>jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec) 
(?P<d>3[0-1]|[1-2]\\d|0[1-9]|[1-9]| [1-9]) 
(?P<H>2[0-3]|[0-1]\\d|\\d):(?P<M>[0-5]\\d|\\d):(?P<S>6[0-1]|[0-5]\\d|\\d)(?:\\.(?P<f>[0-9]{1,6}))?(?:
 (?P<Y>\\d\\d\\d\\d))?', _name='(?:DAY )?MON Day 
24hour:Minute:Second(?:\\.Microseconds)?(?: Year)?') at remote 0x1061840>, 
<DatePatternRegex(_cRegex=<_sre.SRE_Pattern at remote 0x10a0070>, hits=0, 
_pattern='%Y(?P<_sep>[-/.])%m(?P=_sep)%d %H:%M:%S(?:,%f)?', 
_regex='\\b(?P<Y>\\d\\d\\d\\d)(?P<_sep>[-/.])(?P<m>1[0-2]|0[1-9]|[1-9])(?P=_sep)(?P<d>3[0-1]|[1-2]\\d|0[1-9]|[1-9]|
 [1-9]) (?P<H>2[0-3]|[0-1]\\d|\\d):(?P<M>[0-5]\\d|\...(truncated)
    logSys.debug("Sorting the template list")


and 1 is the main server process, which IIRC is also not dealing with the DB
itself.
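
To illustrate the locking concern mentioned above, here is a minimal sketch
(not fail2ban's actual code) of one sqlite3 connection shared across threads,
with every access serialized by a single lock. The class name LockedDb, the
demo() helper, the in-memory database and the reduced "bans" table layout are
all assumptions made for illustration only.

```python
import sqlite3
import threading


class LockedDb(object):
    """Hypothetical wrapper: one shared connection, every call under one lock."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False allows use from threads other than the
        # creating one; the lock is then what keeps calls from overlapping.
        self._con = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()

    def execute(self, sql, params=()):
        with self._lock:
            rows = self._con.execute(sql, params).fetchall()
            self._con.commit()
            return rows


def demo(threads=4, per_thread=25):
    """Insert rows from several threads and return the final row count."""
    db = LockedDb()
    # Assumed, simplified table layout.
    db.execute("CREATE TABLE bans (jail TEXT, ip TEXT, timeofban INTEGER)")

    def add(n):
        for i in range(per_thread):
            db.execute("INSERT INTO bans VALUES (?, ?, ?)",
                       ("sshd", "192.0.2.%d" % n, i))

    pool = [threading.Thread(target=add, args=(n,)) for n in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return db.execute("SELECT COUNT(*) FROM bans")[0][0]
```

If access to a shared connection were not serialized like this, overlapping
calls from different threads would be one candidate explanation worth ruling
out.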

> Does a segfault happen
> in other applications that link to SQLite3?

not that I am aware of (that is why I was trying that silly sqlite3 loop
to replicate it)
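
A replication attempt closer to the failing setup would go through Python's
sqlite3 module from several threads rather than the sqlite3 shell. A rough
sketch, under stated assumptions: the throwaway database, the "bans" column
types, the sample rows, the thread count and the helper names (make_db,
worker, run) are all made up for illustration, and the query is parameterized
here instead of using the quoted literals from the shell loop.

```python
import os
import sqlite3
import tempfile
import threading

QUERY = ("SELECT ip, timeofban, data FROM bans "
         "WHERE 1 AND jail=? AND ip=? ORDER BY ip, timeofban")


def make_db(path):
    """Create a throwaway DB with an assumed 'bans' layout and five rows."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE bans (jail TEXT, ip TEXT, timeofban INTEGER, data TEXT)")
    con.executemany(
        "INSERT INTO bans VALUES (?, ?, ?, ?)",
        [("sshd", "111.74.239.35", 1416800000 + i, "{}") for i in range(5)])
    con.commit()
    con.close()


def worker(path, rounds, results, idx):
    """Run the query repeatedly on a connection private to this thread."""
    con = sqlite3.connect(path)
    count = 0
    for _ in range(rounds):
        count += len(con.execute(QUERY, ("sshd", "111.74.239.35")).fetchall())
    con.close()
    results[idx] = count


def run(threads=4, rounds=100):
    """Return the number of rows each thread fetched in total."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    os.close(fd)
    try:
        make_db(path)
        results = [0] * threads
        pool = [threading.Thread(target=worker,
                                 args=(path, rounds, results, i))
                for i in range(threads)]
        for t in pool:
            t.start()
        for t in pool:
            t.join()
        return results
    finally:
        os.remove(path)
```

Running run() under python-dbg inside gdb, ideally against a copy of the real
fail2ban.sqlite3 instead of the throwaway file, would show whether the
bindings alone can trigger the crash.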

-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Research Scientist,            Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik        

