-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Adrian,
I'm sorry for the trouble. A few words on this:
Disk full events are a general problem and handling them is not easy,
as almost anything can happen, including junking the entire installation.
NfSen and nfdump include a lot of error checking and log any errors
to syslog whenever something goes wrong. That is the good side.
Cleaning up a system like NfSen after a disk full event is not an easy job,
and depending on your layout it may stay corrupted.

What to do, and "why does expire not work"?

Expire works on a per-profile basis, not on a per-disk-volume basis. If there
are many profiles on the disk, each profile is handled separately; NfSen
cannot manage the disk as a whole. Any other data on the disk, including
profile stat data, is not taken into account. (By the way, profile stat data
is limited in size and does not grow unless new profiles/channels are added.)
So the sum of all the profile data must be less than the capacity of the
disk(s) over which you spread NfSen. I have been running NfSen for years on a
2TB disk array for the live profile only (other profiles are on other slots)
with an expire level of 99.99%, and it runs without any problem.
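
As a rough illustration of that arithmetic, here is a small Python sketch that
adds up the per-profile expire limits and compares them with the size of the
volume. The function name, the profile names and the 30G of "other data" are
assumptions made up for the example; this is not how NfSen computes anything
internally.

```python
def expire_budget(capacity_gb, profile_limits_gb, other_data_gb=0.0):
    """Return the space (in GB) left once every profile reaches its limit.

    A negative result means the volume can fill up even though each
    profile stays within its own expire limit.
    """
    planned = sum(profile_limits_gb.values()) + other_data_gb
    return capacity_gb - planned


if __name__ == "__main__":
    # Figures from the report quoted below: a 246G partition, the live
    # profile expiring at 220G and two more profiles at 1G each.
    # The names "profile_a" and "profile_b" are placeholders.
    limits = {"live": 220, "profile_a": 1, "profile_b": 1}

    # Profiles only: some space is left on paper.
    print(expire_budget(246, limits))

    # Same limits plus an assumed 30G of data that expire never sees.
    print(expire_budget(246, limits, other_data_gb=30))
```

The second call goes negative, which is the situation described above: expire
keeps every profile within its limit, but data it does not account for still
fills the disk.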

Therefore PROFILESTATDIR and PROFILEDATADIR in nfsen.conf should be on
different disks: when a disk full event occurs on a shared disk, the profile
stats suffer as well and may get corrupted, which has a serious impact on the
entire installation.
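
One way to check an existing layout, sketched in Python: compare the device
IDs that stat() reports for the two directories. The two paths below are
hypothetical and need to be replaced with the values from your own nfsen.conf.
Note that this only tells you whether the directories are on different
filesystems; it cannot tell whether those filesystems sit on different
physical disks.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem.

    Compares the device IDs reported by stat(); both paths must exist.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical locations; substitute PROFILESTATDIR and PROFILEDATADIR
    # from your own nfsen.conf.
    stat_dir = "/data/nfsen/profiles-stat"
    data_dir = "/data/nfsen/profiles-data"

    if os.path.isdir(stat_dir) and os.path.isdir(data_dir):
        if same_filesystem(stat_dir, data_dir):
            print("warning: stat and data directories share a filesystem")
        else:
            print("ok: stat and data directories are on different filesystems")
    else:
        print("adjust stat_dir and data_dir to match your installation")
```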

The final nfsen 1.3 should include a panic shutdown that triggers when a
PROFILEDATADIR full event is detected, or when a manually set limit is
reached, to prevent further damage.
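
Until then, a similar guard can be approximated outside of NfSen. The Python
sketch below is only an illustration of the idea and not the planned nfsen
implementation: the function names, the limit and the stop callback are all
assumptions. It checks the free space of the filesystem holding a directory
and calls a caller-supplied shutdown function when it drops below a limit.

```python
import shutil


def below_limit(path, min_free_bytes):
    """Return True if the filesystem holding `path` has less free space
    than `min_free_bytes`."""
    return shutil.disk_usage(path).free < min_free_bytes


def guard(path, min_free_bytes, stop_collectors):
    """Call stop_collectors() if free space under `path` is below the limit.

    Returns True if the shutdown callback was triggered, False otherwise.
    """
    if below_limit(path, min_free_bytes):
        stop_collectors()
        return True
    return False


if __name__ == "__main__":
    def stop():
        # Placeholder: a real setup would stop the collectors here.
        print("free space below limit, stopping collectors")

    # "." stands in for the profile data directory; 5 GiB is an arbitrary limit.
    guard(".", 5 * 1024**3, stop)
```

Such a check would have to run periodically, for example from cron, and the
limit should be larger than the amount of data the collectors can write
between two runs.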

Nevertheless, I will go through your logs to see where NfSen can handle this
event better. The hierarchy check is certainly a candidate.

I know that this will not bring back your data, but I hope it helps a bit
anyway.

    - Peter

- --On January 30, 2007 9:30:23 +0200 Adrian Popa <[EMAIL PROTECTED]> wrote:

| Hello everybody,
|
| I had my nfsen installation dumping files on a 246G partition and I had
| set the expire threshold for the live profile to 220G. I had 2 other
| profiles, each expiring at 1G. There was no time limit for expiration.
|
| This morning I had the unpleasant surprise to see that the whole
| partition was 100% full, and nfsen was in agony...
|
| Here are some snippets from the log file:
| ...
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Process_v9: output
| buffer size error. Abort v9 record processing
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Failed to write
| output buffer to disk: 'No space left on device'
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Process_v9: output
| buffer size error. Abort v9 record processing
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Failed to write
| output buffer to disk: 'No space left on device'
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Process_v9: output
| buffer size error. Abort v9 record processing
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Failed to write
| output buffer to disk: 'No space left on device'
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Process_v9: output
| buffer size error. Abort v9 record processing
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Failed to write
| output buffer to disk: 'No space left on device'
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Failed to write
| output buffer to disk: 'No space left on device'
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20199]: Can't rename dump
| file: No space left on device
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Ident: '7304bb2'
| Flows: 20164, Packets: 152128, Bytes: 70428325, Sequence Errors: 13468,
| Bad Packets: 0
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20199]: Terminating nfcapd.
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20196]: Terminating nfcapd.
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20193]: Ident: '7304bcnt2'
| Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 26771, Bad Packets: 0
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20193]: Terminating nfcapd.
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20205]: Ident: '7606_2_Lab'
| Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20208]: Ident: '7606_1_Lab'
| Flows: 0, Packets: 0, Bytes: 0, Sequence Errors: 0, Bad Packets: 0
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20205]: Terminating nfcapd.
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20208]: Terminating nfcapd.
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20202]: Can't rename dump
| file: No space left on device
| Jan 30 08:14:52 hail /usr/local/bin/nfcapd[20202]: Terminating nfcapd.
|
| This is not the issue - I understand that file expiration is not exact
| on the limit...
| So, I set out to free some space on the partition, by deleting old flow
| files.
|
| After I restarted nfsen, I got this problem:
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20241]: Standard setsockopt,
| SO_RCVBUF is 135168 Requested length is 200000 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20241]: System set
| setsockopt, SO_RCVBUF to 262142 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20243]: Startup.
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20243]: Process_v9: New
| exporter domain 0
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20244]: Standard setsockopt,
| SO_RCVBUF is 135168 Requested length is 200000 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20244]: System set
| setsockopt, SO_RCVBUF to 262142 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20246]: Startup.
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20246]: Process_v9: New
| exporter domain 0
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20247]: Standard setsockopt,
| SO_RCVBUF is 135168 Requested length is 200000 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20247]: System set
| setsockopt, SO_RCVBUF to 262142 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20249]: Startup.
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20250]: Standard setsockopt,
| SO_RCVBUF is 135168 Requested length is 200000 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20250]: System set
| setsockopt, SO_RCVBUF to 262142 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20252]: Startup.
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20253]: Standard setsockopt,
| SO_RCVBUF is 135168 Requested length is 200000 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20253]: System set
| setsockopt, SO_RCVBUF to 262142 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20255]: Startup.
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20256]: Standard setsockopt,
| SO_RCVBUF is 135168 Requested length is 200000 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20256]: System set
| setsockopt, SO_RCVBUF to 262142 bytes
| Jan 30 08:16:03 hail /usr/local/bin/nfcapd[20258]: Startup.
| Jan 30 08:16:03 hail nfsen[20259]: Startup. Version: snapshot-20070110
| $Id: nfsend 60 2007-01-09 12:26:47Z peter $
| Jan 30 08:16:03 hail nfsen[20259]: Verification sub hierarchy failed.
| Expected file '/data/nfsen/profiles/live/7304bcnt2//nfcapd.200701292045'
| does not exist!
|
| Jan 30 08:16:03 hail nfsen[20259]: This may indicate an inconsitency
| between configured sub hierarchy layout and real layout.
| Jan 30 08:16:03 hail nfsen[20259]: Rerun RebuildHierarchy.pl to fix.
|
| There were a lot of missing files, because they couldn't be created.
| Unfortunately, RebuildHierarchy couldn't recreate them :( (it had an
| undefined variable somewhere - I didn't write down the line number - sorry).
|
| I recreated some files by hand (with touch), and nfsen seemed to start
| all right - unfortunately, the web interface had 'Graph Errors'.
| Jan 30 08:23:17 hail nfsen[20437]: Update profile live in group .
| Jan 30 08:23:17 hail nfsen[20437]: Failed get stat info for requested
| time slot
|
| Jan 30 08:23:24 hail prefixStats: comm server started: 21578
| Jan 30 08:23:24 hail prefixStats: Error generating details graph: Arg:
| 'live', '', 'TCP', 'flows', '', '-86400', '', '1170013200',
| '1170013200', '288', '100',
|  '1', '0', '0'
| Jan 30 08:23:24 hail prefixStats: comm server started: 21581
| Jan 30 08:23:24 hail prefixStats: Error generating details graph: Arg:
| 'live', '', 'UDP', 'flows', '', '-86400', '', '1170013200',
| '1170013200', '288', '100',
|  '1', '0', '0'
|
| After a few more tries, I tried to delete the sources and add them
| again. I had problems again - lots of uninitialized variables and
| nothing done when trying to re-add the sources. Unfortunately, I didn't
| save the error messages, but maybe you can get an idea from syslog (I
| was doing ./nfsen reconfig):
|
| Jan 30 08:31:58 hail nfsen[28716]: Startup. Version: snapshot-20070110
| $Id: nfsend 60 2007-01-09 12:26:47Z peter $
| Jan 30 08:31:58 hail nfsen[28718]: Comm server started: [28718]
| Jan 30 08:31:58 hail nfsen[28717]: nfsend: [28717]
| Jan 30 08:31:58 hail nfsen[28717]: Use of uninitialized value in join or
| string at /data/nfsen/libexec/NfProfile.pm line 772, <ProFILE> line 15.
| Jan 30 08:31:58 hail last message repeated 2 times
| Jan 30 08:31:58 hail nfsen[28717]: Update profile live in group .
| Jan 30 08:31:58 hail nfsen[28717]: Error GenGraph: Profile: live,
| traffic-day: parameter '7304bcnt21170138600' does not represent a number
| in line AREA:7304bcnt21170138600:7304bcnt2
| Jan 30 08:31:58 hail last message repeated 3 times
| Jan 30 08:31:58 hail nfsen[28717]: Error GenGraph: Profile: live,
| flows-day: parameter '7304bcnt21170138600' does not represent a number
| in line AREA:7304bcnt21170138600:7304bcnt2
| Jan 30 08:31:58 hail last message repeated 3 times
| Jan 30 08:31:58 hail nfsen[28717]: Error GenGraph: Profile: live,
| packets-day: parameter '7304bcnt21170138600' does not represent a number
| in line AREA:7304bcnt21170138600:7304bcnt2
| Jan 30 08:31:58 hail last message repeated 3 times
| Jan 30 08:31:58 hail nfsen[28717]: Error graph update: Error GenGraph:
| Profile: live, packets-day: parameter '7304bcnt21170138600' does not
| represent a number
|  in line AREA:7304bcnt21170138600:7304bcnt2
| Jan 30 08:31:58 hail nfsen[28717]: Use of uninitialized value in join or
| string at /data/nfsen/libexec/NfProfile.pm line 827.
| Jan 30 08:31:58 hail last message repeated 2 times
|
| The solution was to completely delete /data/nfsen and reinstall it from
| scratch. Now it works ok, but I don't know how I could have fixed it
| without reinstalling.
|
| Also, why didn't the expire settings work in the first place? How much
| free space do I have to leave on the partition to ensure that this
| doesn't happen again?
|
| Thank you for your time.
|
| --
| Adrian Popa
|
|
|
|
| -------------------------------------------------------------------------
| Take Surveys. Earn Cash. Influence the Future of IT
| Join SourceForge.net's Techsay panel and you'll get the chance to share your
| opinions on IT & business topics through brief surveys - and earn cash
| http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
| _______________________________________________
| Nfsen-discuss mailing list
| [email protected]
| https://lists.sourceforge.net/lists/listinfo/nfsen-discuss



- --
_______ SWITCH - The Swiss Education and Research Network ______
Peter Haag,  Security Engineer,  Member of SWITCH CERT
PGP fingerprint: D9 31 D5 83 03 95 68 BA  FB 84 CA 94 AB FC 5D D7
SWITCH,  Limmatquai 138,  CH-8001 Zurich,  Switzerland
E-mail: [EMAIL PROTECTED] Web: http://www.switch.ch/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iQCVAwUBRb8jFP5AbZRALNr/AQIzFQQAkrQ53zjSRMjb/q7jws+BQhKyFtKKYCr5
1U6LieU2XjT9Su6HBPbXo4eI+VRgiza/l8VBj27tbpBIYr47sC0kwBh5YNFZWeQY
4zG2P5pdk2XUcWOsL/pqwH+MZ+lQGU2kpxnzvxpP8ng/oVWgOnWLUpv8vd8Z88pT
T8J/l8fAk0U=
=8k4a
-----END PGP SIGNATURE-----

