we got a new core dump from 7.3.7 today.  this instance was running on a
freshly installed computer, to rule out (we hope) hardware problems.  it
is the same brand and model as the old one, though.  the old system has
now been running hard disk tests for 30+ hours without any errors.

the core dump happens at the same place in the code, and this time we
got a complete backtrace:

(gdb) bt
#0  0xb734d07c in memcpy () from /lib/tls/libc.so.6
#1  0x0806bba8 in DataFill (data=0xb7488fff "", tupleDesc=0x82899a0, 
    value=0x8289980, nulls=0xbfffd3c0 "    n    \"", infomask=0x8806b04c, 
    bit=0x8806b04f "�\001") at heaptuple.c:139
#2  0x0806c3ee in heap_formtuple (tupleDescriptor=0x8279ec0, value=0x8289980, 
    nulls=0xbfffd3c0 "    n    \"") at heaptuple.c:623
#3  0x080d1af1 in ExecTargetList (targetlist=0x8278298, nodomains=9, 
    targettype=0x8279ec0, values=0x8289980, econtext=0x8279a60, 
    isDone=0xbfffd468) at execQual.c:2230
#4  0x080d1cdb in ExecScan (node=0x827a208, accessMtd=0xbfffd468)
    at execScan.c:49
#5  0x080d1d7d in ExecScan (node=0x8278c70, accessMtd=0x80d7c58 <SeqNext+24>)
    at execScan.c:146
#6  0x080d7cfb in InitScanRelation (node=0x82899a0, estate=0x8278c70, 
    scanstate=0xbfffd4c8) at nodeSeqscan.c:162
#7  0x080cfd86 in ExecProcNode (node=0x8289bf8, parent=0x0)
    at execProcnode.c:315
#8  0x080cecf3 in ExecutePlan (estate=0x8279c90, plan=0x8278c70, 
    operation=CMD_SELECT, numberTuples=0, direction=136878496, 
    destfunc=0x82899c8) at execMain.c:964
#9  0x080ce392 in ExecutorEnd (queryDesc=0x82899a0, estate=0x0)
    at execMain.c:223
#10 0x0811d069 in ProcessQuery (parsetree=0x82899c8, plan=0x8278c70, 
    dest=Remote, completionTag=0xbfffd610 "") at pquery.c:251
#11 0x0811b7ed in pg_exec_query_string (query_string=0xbfffd610, dest=Remote, 
    parse_context=0x823d610) at postgres.c:844
#12 0x0811c64d in PostgresMain (argc=4, argv=0xbfffd850, 
    username=0x8238c69 "cerebrum") at postgres.c:2018
#13 0x0810413d in DoBackend (port=0x8238b38) at postmaster.c:2304
#14 0x08103cb2 in BackendStartup (port=0x8238b38) at postmaster.c:1935
#15 0x08102dad in ServerLoop () at postmaster.c:1016
#16 0x081027ea in PostmasterMain (argc=1, argv=0x8220170) at postmaster.c:797
#17 0x080e1234 in main (argc=1, argv=0xbfffe204) at main.c:217



(gdb) print *att[i]
$20 = {attrelid = 0, attname = {
    data = "pageunits_total", '\0' <repeats 48 times>, 
    alignmentDummy = 1701273968}, atttypid = 1700, attstattarget = -1, 
  attlen = -1, attnum = 9, attndims = 0, attcacheoff = -1, atttypmod = 393220, 
  attbyval = 0 '\0', attstorage = 109 'm', attisset = 0 '\0', 
  attalign = 105 'i', attnotnull = 0 '\0', atthasdef = 0 '\0', 
  attisdropped = 0 '\0', attislocal = 1 '\001', attinhcount = 0}
(gdb) print i
$21 = 8
(gdb) x/10 value[i]
0xb7190928:     0x2f00000b      0x00000000      0x00200000      0x00000207
0xb7190938:     0x00000314      0x01bf913d      0x10120000      0x00090020
0xb7190948:     0xef201553      0x00000001


the relevant code again is:

       if (att[i]->attbyval)
           [...]
       else if (att[i]->attlen == -1)
           [...]
       else if (att[i]->attlen == -2)
           [...]
       else
       {
               /* fixed-length pass-by-reference */
               Assert(att[i]->attlen > 0);
               data_length = att[i]->attlen;
===>           memcpy(data, DatumGetPointer(value[i]), data_length);
       }

(gdb) print data_length
$25 = 788529163
(gdb) print att[i]->attlen
$26 = -1

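how can att[i]->attlen be -1 at this point?  the final else branch
should only be reached when attlen is neither -1 nor -2, so it would
have had to change in the interim.  data_length looks corrupted, too.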
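
one observation, offered as a hint rather than a diagnosis: 788529163 is
0x2f00000b, which is exactly the first word gdb shows at value[i].  that
would be consistent with data_length having been read from the datum's
own length word, as the attlen == -1 branch would do, with gdb
attributing the memcpy to the wrong source line in an optimized build.
the sketch below only checks the arithmetic.  the 0x3fffffff size mask
is an assumption about how the length word is decoded in this release,
and the helper names are made up for illustration.

```c
#include <stdint.h>

/* assumption: the low 30 bits of a varlena length word hold the size */
#define ASSUMED_SIZE_MASK 0x3fffffffu

/* values copied from the gdb session in this message */
static const uint32_t first_word_at_value_i = 0x2f00000bu; /* x/10 value[i] */
static const uint32_t observed_data_length = 788529163u;   /* print data_length */

/* hypothetical helper, not a PostgreSQL function */
static uint32_t masked_size(uint32_t length_word)
{
    return length_word & ASSUMED_SIZE_MASK;
}

/* returns 1 if data_length equals the masked first word of the datum */
static int data_length_matches_length_word(void)
{
    return masked_size(first_word_at_value_i) == observed_data_length;
}
```

if that reading is right, the question shifts from "how did attlen
change" to "why does the datum carry a length of roughly 788 MB".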

(gdb) print *att[i-1]
$27 = {attrelid = 0, attname = {
    data = "pageunits_paid", '\0' <repeats 49 times>, 
    alignmentDummy = 1701273968}, atttypid = 1700, attstattarget = -1, 
  attlen = -1, attnum = 8, attndims = 0, attcacheoff = -1, atttypmod = 393220, 
  attbyval = 0 '\0', attstorage = 109 'm', attisset = 0 '\0', 
  attalign = 105 'i', attnotnull = 0 '\0', atthasdef = 0 '\0', 
  attisdropped = 0 '\0', attislocal = 1 '\001', attinhcount = 0}
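
for reference, both columns are numeric (atttypid 1700) with atttypmod
393220.  assuming the usual numeric typmod encoding of
((precision << 16) | scale) plus the varlena header size, that decodes
to numeric(6,0).  the helper names below are hypothetical, and the
header size of 4 is an assumption for this release.

```c
#include <stdint.h>

#define ASSUMED_VARHDRSZ 4  /* assumption: 4-byte varlena header */

/* hypothetical helper: precision stored in the high 16 bits */
static int numeric_typmod_precision(int32_t typmod)
{
    return (int) (((typmod - ASSUMED_VARHDRSZ) >> 16) & 0xffff);
}

/* hypothetical helper: scale stored in the low 16 bits */
static int numeric_typmod_scale(int32_t typmod)
{
    return (int) ((typmod - ASSUMED_VARHDRSZ) & 0xffff);
}
```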

also:

(gdb) print data
$39 = 0xb7488fff ""

which doesn't look aligned for an integer (attalign is 'i').

(gdb) print data[1]
Cannot access memory at address 0xb7489000
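
to put numbers on that: 0xb7488fff is 3 bytes past a 4-byte boundary and
is the last byte of its page, which fits gdb failing at 0xb7489000.
this assumes 4 KiB pages, which the dump does not show.  a small sketch
of the arithmetic, with hypothetical helper names:

```c
#include <stdint.h>

#define ASSUMED_PAGE_SIZE 4096u  /* assumption: typical i386 Linux page size */

/* bytes past the previous `align` boundary; align must be a power of two */
static uint32_t misalignment(uint32_t addr, uint32_t align)
{
    return addr & (align - 1u);
}

/* bytes from addr to the end of its page, counting addr itself */
static uint32_t bytes_left_in_page(uint32_t addr)
{
    return ASSUMED_PAGE_SIZE - (addr & (ASSUMED_PAGE_SIZE - 1u));
}
```

with data = 0xb7488fff, misalignment(data, 4) is 3 and
bytes_left_in_page(data) is 1, so any copy longer than one byte would
run into the next page.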

thank you for any insights.
-- 
Kjetil T.
