It's not ntop segfaulting, it's gdbm doing an exit():
gdbmfetch.c:
  /* Copy the data if the key was found. */
  if (elem_loc >= 0)
    {
      /* This is the item. Return the associated data. */
      return_val.dsize = dbf->bucket->h_table[elem_loc].data_size;
      if (return_val.dsize == 0)
        return_val.dptr = (char *) malloc (1);
      else
        return_val.dptr = (char *) malloc (return_val.dsize);
      if (return_val.dptr == NULL) _gdbm_fatal (dbf, "malloc error");
      bcopy (find_data, return_val.dptr, return_val.dsize);
    }
But _gdbm_fatal() always ends in exit(), whether or not a fatal_err handler is set:
void
_gdbm_fatal (dbf, val)
     gdbm_file_info *dbf;
     char *val;
{
  if ((dbf != NULL) && (dbf->fatal_err != NULL))
    (*dbf->fatal_err) (val);
  else
    {
      write (STDERR_FILENO, "gdbm fatal: ", 12);
      if (val != NULL)
        write (STDERR_FILENO, val, strlen(val));
      write (STDERR_FILENO, "\n", 1);
    }
  exit (1);
  /* NOTREACHED */
}
The exit(1) will kill the entire process, so there's no recovery possible.
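Since the exit(1) takes out whatever process made the call, one way to survive it would be to do the risky read in a throwaway child process and look at its exit status. A rough sketch only, not tested against ntop: probe_in_child, db_probe_fn and the two demo_ probes are names made up for illustration, and a real probe would presumably open the file and walk it with gdbm_firstkey/gdbm_nextkey/gdbm_fetch.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback: returns 0 if the database at `path` read cleanly.
   It may also never return, e.g. if gdbm calls exit(1) underneath it. */
typedef int (*db_probe_fn)(const char *path);

/* Run `probe` in a forked child so an exit() inside it cannot kill the caller.
   Returns 1 if the probe succeeded, 0 if it failed or the child died,
   -1 if fork() or waitpid() itself failed. */
int probe_in_child(db_probe_fn probe, const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: report the probe result through the exit status. */
        _exit(probe(path) == 0 ? 0 : 2);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}

/* Stand-in probes for demonstration only; neither touches gdbm. */
int demo_probe_ok(const char *path)
{
    (void) path;
    return 0;
}

int demo_probe_fatal(const char *path)
{
    (void) path;
    exit(1);   /* mimics what _gdbm_fatal does on "malloc error" */
}
```

With these stand-ins, probe_in_child(demo_probe_fatal, ...) returns 0 and the caller keeps running. In a threaded program the fork would have to happen before the threads are started, and the child inherits any unflushed stdio buffers, so this fits an initialization-time check better than a check in the packet path.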
-----Burton
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Robbert Kouprie
Sent: Tuesday, April 26, 2005 3:12 AM
To: [email protected]
Subject: RE: [Ntop-dev] GDBM data corruption causing ntop crash [was:
Datacorruption [was: Ntop segfaulting] (fwd)]
Hey,
On Mon, 25 Apr 2005, Burton Strauss wrote:
> I think that's a marginal idea Batman ... It sounds - from the ENOMEM
> - as though a record is in the database w/ a huge record size. This
> means that any attempt to read it - by any program invoking gdbm_fetch
> - will crash and burn - and not in a way we can recover from.
So we segfault??
> Some sort of stand-alone utility would be better, but since ntop will
> recreate the file if necessary, it's better to just nuke it. And
> don't think about a periodic purge + reorg (you would have to lock for
> the duration of the reorg). That's just asking for trouble :-(
If you used a standalone utility that nukes corrupted dbms when found,
you would have to restart ntop anyway to recreate the dbms. So we might
as well add it to ntop initialization.
The only benefit of a standalone utility would be an "if (corrupt) { purge
}" style of thing where ntop could stay alive during the purge.
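To make the "if (corrupt) { purge }" idea concrete, here is a minimal sketch of the purge half. The names purge_corrupt, db_check_fn and the demo_ checkers are invented for this example, and the corruption test is left as a callback because it has to be something that survives gdbm's exit(1), for instance a separate process. Whether ntop would recreate a removed file without a restart is not addressed here.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback: returns nonzero if the database at `path` is corrupt. */
typedef int (*db_check_fn)(const char *path);

/* Remove every file in `paths` that the checker flags as corrupt.
   Returns the number of files actually removed. */
int purge_corrupt(const char *const paths[], int n, db_check_fn is_corrupt)
{
    int purged = 0;
    for (int i = 0; i < n; i++) {
        if (is_corrupt(paths[i]) && unlink(paths[i]) == 0) {
            fprintf(stderr, "purged corrupt database %s\n", paths[i]);
            purged++;
        }
    }
    return purged;
}

/* Stand-in checkers for demonstration only. */
int demo_always_corrupt(const char *path)
{
    (void) path;
    return 1;
}

int demo_never_corrupt(const char *path)
{
    (void) path;
    return 0;
}
```

Run over the list of dbm files at startup, this leaves healthy files alone and removes only the flagged ones.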
Regards,
Robbert
>
>
>
>
> -----Burton
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
> Behalf Of Robbert Kouprie
> Sent: Monday, April 25, 2005 2:17 PM
> To: [email protected]
> Subject: [Ntop-dev] GDBM data corruption causing ntop crash [was: Data
> corruption [was: Ntop segfaulting] (fwd)]
>
> Hi Burton,
>
> Thanks for the info!
>
> Burton Strauss wrote:
>> There's two thoughts - either (1) gdbm db has been corrupted (leading
>> ntop to put bad stuff into memory) or (2) something else corrupted
>> the HostTraffic chains.
>
> It looks like (1), see below.
>
>> You can try using dumpgdbm (I've posted this before) and/or
>> dnscachePurge (SourceForge).
>
> Ah, neat tools. I wrote a simple Perl script myself (called "dumpgdbm.pl"
> hehe) but your tool is a little bit more verbose.
>
> Anyway, both tools crash on a certain entry in the dbm file:
>
> Below is the strace of your dumpgdbm when failing. First I included
> two correctly processed entries. The last one is failing. Notice the
> integer '4096' (which should indicate the length of the field?).
>
> lseek(3, 36898923, SEEK_SET) = 36898923
> read(3, "1150659434\0s0106000c6e23cade.ed."..., 83) = 83
> write(1, " \'1150659434\' : ( 72) 7330"..., 83 '1150659434'
> : ( 72) 73303130 36303030 63366532 33636164 s0106000c6e23cad
> ) = 83
> write(1, " 652e"..., 83
> 652e6564 2e736861 77636162 6c652e6e e.ed.shawcable.n
> ) = 83
> write(1, " 6574"..., 83
> 65740000 00000000 00000000 00000000 et..............
> ) = 83
> write(1, " 0000"..., 83
> 00000000 00000000 00000000 00000000 ................
> ) = 83
> write(1, " 1f7e"..., 83
> 1f7e4742 1d000000 .~GB............
> ) = 83
> lseek(3, 4242015, SEEK_SET) = 4242015
> read(3, "1143704778\0pcp05033799pcs.plyntv"..., 83) = 83
> write(1, " \'1143704778\' : ( 72) 7063"..., 83 '1143704778'
> : ( 72) 70637030 35303333 37393970 63732e70 pcp05033799pcs.p
> ) = 83
> write(1, " 6c79"..., 83
> 6c796e74 7630312e 6d692e63 6f6d6361 lyntv01.mi.comca
> ) = 83
> write(1, " 7374"..., 83
> 73742e6e 65740000 00000000 00000000 st.net..........
> ) = 83
> write(1, " 0000"..., 83
> 00000000 00000000 00000000 00000000 ................
> ) = 83
> write(1, " 580e"..., 83
> 580e4442 1d000000 X.DB............
> ) = 83
> lseek(3, 35229696, SEEK_SET) = 35229696
> read(3, "2191640561\0dct9241.dct.tudelft.n"..., 4096) = 4096
> mmap2(NULL, 2000834560, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> brk(0) = 0x816f000
> brk(0x7f5a7000) = 0x816f000
> mmap2(NULL, 2000969728, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x40862000
> munmap(0x40862000, 647168) = 0
> munmap(0x40a00000, 401408) = 0
> mprotect(0x40900000, 135168, PROT_READ|PROT_WRITE) = 0
> mmap2(NULL, 2000834560, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
> write(2, "gdbm fatal: ", 12gdbm fatal: ) = 12
> write(2, "malloc error", 12malloc error) = 12
> write(2, "\n", 1
> ) = 1
> munmap(0xb7fe8000, 4096) = 0
> exit_group(1) = ?
>
> Anyway, the database is corrupt for some reason. Let's blame it on the
> hardware for now (I'm still testing). But would it be wise to add some
> kind of check at ntop start, that tries to read in all gdbms that we use?
>
> It helps take the blame off ntop when gdbms are corrupt, and it
> helps bughunting.
>
> - Robbert
> _______________________________________________
> Ntop-dev mailing list
> [email protected]
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev