Hi,
I manage several databases on FreeBSD 2.2.8-STABLE, and since upgrading to
PostgreSQL 6.5.1 (from the ports collection) I've been getting quite a few
backend crashes.
The applications running on the databases do real-time RADIUS
authentication and accounting. Two databases are involved, and one of
them gets at least 16k queries/day from RADIUS alone.
There are also CGIs that query various information, for which I don't have
a good hits/day estimate.
Most of the applications are written in Perl using DBI and DBD::Pg.
There's also a patched QPopper that talks to one of the databases to
verify users' accounts and passwords.
I've got backtraces of the crashes, and both backends crash in the same
function even though the databases are very different, with different
schemas and sizes. They are running under the same postmaster.
Both seem to crash at the same time (within the same minute), which leads
me to believe that one crash provoked the other.
Here's a backtrace from one database that contains 450k+ records:
(no debugging symbols found)...Core was generated by `postgres'.
Program terminated with signal 10, Bus error.
Cannot access memory at address 0x2010d080.
#0 0x16f62 in _bt_isortcmpinit ()
#0 0x16f62 in _bt_isortcmpinit ()
#1 0x13c3c in btgettuple ()
#2 0xd6d8a in fmgr_c ()
#3 0xd7170 in fmgr ()
#4 0xd1d3 in index_getnext ()
#5 0x3ed77 in IndexNext ()
#6 0x3b838 in ExecScan ()
#7 0x3eed0 in ExecIndexScan ()
#8 0x39fc6 in ExecProcNode ()
#9 0x38f39 in ExecutePlan ()
#10 0x3871d in ExecutorRun ()
#11 0xa8b27 in ProcessQueryDesc ()
#12 0xa8b8c in ProcessQuery ()
#13 0xa6b3a in pg_exec_query_dest ()
#14 0xa69d7 in pg_exec_query ()
#15 0xa83bc in PostgresMain ()
#16 0x8e4ca in DoBackend ()
#17 0x8dfce in BackendStartup ()
#18 0x8d36e in ServerLoop ()
#19 0x8cbe7 in PostmasterMain ()
#20 0x4a0c2 in main ()
And here's the other one with 16k+ records:
(no debugging symbols found)...Core was generated by `postgres'.
Program terminated with signal 10, Bus error.
Cannot access memory at address 0x2010d080.
#0 0x16f62 in _bt_isortcmpinit ()
#0 0x16f62 in _bt_isortcmpinit ()
#1 0x15b22 in _bt_first ()
#2 0x13c4b in btgettuple ()
#3 0xd6d8a in fmgr_c ()
#4 0xd7170 in fmgr ()
#5 0xd1d3 in index_getnext ()
#6 0x21894 in CatalogIndexFetchTuple ()
#7 0x21958 in AttributeNameIndexScan ()
#8 0xd17f9 in SearchSysCache ()
#9 0xd4d6c in SearchSysCacheTuple ()
#10 0xd5218 in get_attnum ()
#11 0x6e9ad in colnameRangeTableEntry ()
#12 0x6acf7 in transformIdent ()
#13 0x6a7b6 in transformExpr ()
#14 0x6a579 in transformExpr ()
#15 0x696ad in transformWhereClause ()
#16 0x4b488 in transformSelectStmt ()
#17 0x4a395 in transformStmt ()
#18 0x4a13b in parse_analyze ()
#19 0x68f10 in parser ()
#20 0xa66cb in pg_parse_and_plan ()
#21 0xa6a55 in pg_exec_query_dest ()
#22 0xa69d7 in pg_exec_query ()
#23 0xa83bc in PostgresMain ()
#24 0x8e4ca in DoBackend ()
#25 0x8dfce in BackendStartup ()
#26 0x8d36e in ServerLoop ()
#27 0x8cbe7 in PostmasterMain ()
#28 0x4a0c2 in main ()
The first database above is the one that gets 16k queries/day; the second
one gets significantly less traffic, probably somewhere around 200
queries/day.
Has anyone seen this problem? I'd like to get more detailed debugging
info if someone can show me how to compile PostgreSQL with debugging
symbols.
Thanks for any info anyone can provide.
Cheers,
Anto.