Hi Ruben,

I have committed a patch to CVS that should help with the issue you reported. Can you pull the latest code from CVS and let me know if it appears to solve your issue? If not, I have a second piece of code in mind to touch.

Cheers,
Paolo

On Wed, Feb 05, 2014 at 08:28:17AM +0100, Ruben Laban wrote:
> Hi Paolo,
>
> See inline...
>
> On 2014-02-04 18:26, Paolo Lucente wrote:
> >Configuration looks allright. And it can't supposedly happen with
> >libpcap that you have packets for the current (and/or future time
> >slots) when writing to the file: this might happen with nfacctd
> >instead, ie. in case NetFlow timestamps are in future because of
> >routers not being NTP sync'd.
> >
> >I was wondering if plugin_buffer_size is not maybe too much for
> >your scenario but then again you would not see a smaller amount
> >of packets before a larger one. Maybe worth trying reducing it
> >anyway and see if this has any effect.
>
> After cutting down both plugin_buffer_size and plugin_pipe_size by a
> factor 10, I get the same behavior. Same with a factor 100, though
> then I get a lot of:
>
> Feb 5 07:53:39 gw02 pmacctd[3806]: ERROR ( traffic/print ): We are missing data.
> Feb 5 07:53:39 gw02 pmacctd[3806]: If you see this message once in a while, discard it. Otherwise some solutions follow:
> Feb 5 07:53:39 gw02 pmacctd[3806]: - increase shared memory size, 'plugin_pipe_size'; now: '4024000'.
> Feb 5 07:53:39 gw02 pmacctd[3806]: - increase buffer size, 'plugin_buffer_size'; now: '4024'.
> Feb 5 07:53:39 gw02 pmacctd[3806]: - increase system maximum socket size.#012
>
> (Which also shows a \n that shouldn't be there.)
>
> >What pmacct version you are running? Should the buffering idea
> >above not work, would it be an option to get temporary remote
> >access to your box for some troubleshooting?
>
> These tests are performed using 1.5.0rc2. As for getting access,
> I'll have to see if I can place this test environment somewhere safe
> network-wise so you could access it.
>
> Not sure if it's related (was gonna keep all issues separate at
> first, but perhaps they're somehow linked), but the memory usage is
> something that caught my attention as well:
>
> vanilla pmacctd:
>
>   PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 31909 root  20  0 2577m 2.5g 384m R   92 32.2 487:02.71 pmacctd: Print Plugin [traffic]
> 31908 root  20  0  397m 387m 386m R   72  4.9 381:43.56 pmacctd: Core Process [default]
>
> pmacctd with PF_RING support:
>
>   PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 14556 root  20  0 2575m 2.5g 384m S   20 32.2  51:54.62 pmacctd: Print Plugin [traffic]
> 14555 root  20  0  394m 385m 385m R   95  4.8 196:28.39 pmacctd: Core Process [default]
>
>
> The above were using the initial config, thus with sizes set to
> 402400/402400000. When reducing those by a factor 100, I still get
> this (with PF_RING support):
>
>   PID USER  PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>  3806 root  20  0 2195m 2.1g 4536 S   31 27.4   3:07.41 pmacctd: Print Plugin [traffic]
>  3805 root  20  0 14552 5800 5240 R   97  0.1   9:54.50 pmacctd: Core Process [default]
>
> This is on a Dell R210-II with an Intel(R) Xeon(R) CPU E31230 @
> 3.20GHz and 8GB RAM. Traffic being monitored is sent using pfsend on
> an identically spec'ed box at the following rate:
> TX rate: [current 862'368.05 pps/0.58 Gbps][average 863'190.72 pps/0.58 Gbps][total 70'765'733'449.00 pkts]
>
> I don't expect this kind of traffic in my live environment under
> normal circumstances, but one of the goals of this project is to
> make sure everything keeps working properly during a (D)DoS as well.
>
> Regards,
> Ruben
>
>
> >Cheers,
> >Paolo
> >
> >On Tue, Feb 04, 2014 at 08:31:42AM +0100, Ruben Laban wrote:
> >>Hi,
> >>
> >>I'll start with my config:
> >>
> >>daemonize: true
> >>pidfile: /var/run/pmacctd.pid
> >>syslog: daemon
> >>aggregate: dst_host
> >>interface: eth5
> >>plugins: print[traffic]
> >>print_output_file[traffic]: /tmp/traffic-eth5-%Y%m%d_%H%M.txt
> >>print_output[traffic]: csv
> >>print_refresh_time[traffic]: 60
> >>print_history[traffic]: 1m
> >>plugin_buffer_size: 402400
> >>plugin_pipe_size: 402400000
> >>print_cache_entries: 999991
> >>print_output_file_append: true
> >>print_history_roundoff: m
> >>
> >>What I observe is that every minute when the data gets flushed to
> >>disk, 2 files get updated: the file for the previous minute, and the
> >>file for the current minute. This leads to files containing the
> >>following:
> >>
> >># for i in /tmp/traffic-eth5-20140204_09* ; do echo $i: ; cat $i ; done
> >>/tmp/traffic-eth5-20140204_0900.txt:
> >>DST_IP,PACKETS,BYTES
> >>192.168.0.1,1496262,68828052
> >>192.168.0.1,87794632,4038553072
> >>/tmp/traffic-eth5-20140204_0901.txt:
> >>DST_IP,PACKETS,BYTES
> >>192.168.0.1,662553,30477438
> >>192.168.0.1,45962195,2114260970
> >>/tmp/traffic-eth5-20140204_0902.txt:
> >>DST_IP,PACKETS,BYTES
> >>192.168.0.1,1495840,68808640
> >>
> >>(This time I'm using psend to send traffic through the monitored
> >>interface, which uses a single destination IP.)
> >>
> >>As you can see this leads to "duplicate" entries (more than one
> >>entry per aggregate). One way to get rid of the duplicates would be
> >>to disable print_output_file_append, but then I'd lose data.
> >>
> >>I'm *guessing* that what happens is that when it's time to flush the
> >>data of the previous interval to file, there's already data for the
> >>current interval. And then pmacct decides to flush that part of the
> >>data as well. Is this correct?
> >>
> >>Regards,
> >>Ruben
> >>
> >>_______________________________________________
> >>pmacct-discussion mailing list
> >>http://www.pmacct.net/#mailinglists
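
Until the duplicate rows are resolved on the pmacct side, the per-minute files quoted above could be post-processed so that each aggregate appears once. The following is only a minimal sketch and not part of pmacct: the function name merge_duplicates is hypothetical, and it assumes the CSV header DST_IP,PACKETS,BYTES shown in the thread and that summing the duplicate rows is the desired result.

```python
import csv
import io
from collections import defaultdict


def merge_duplicates(csv_text):
    """Sum PACKETS and BYTES over rows that share the same DST_IP.

    Hypothetical helper: assumes the DST_IP,PACKETS,BYTES header
    seen in the print plugin output quoted in this thread.
    """
    totals = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["DST_IP"]][0] += int(row["PACKETS"])
        totals[row["DST_IP"]][1] += int(row["BYTES"])
    return {ip: (packets, byte_count) for ip, (packets, byte_count) in totals.items()}


# Contents of /tmp/traffic-eth5-20140204_0900.txt as quoted above.
sample = """DST_IP,PACKETS,BYTES
192.168.0.1,1496262,68828052
192.168.0.1,87794632,4038553072
"""

merged = merge_duplicates(sample)
for ip, (packets, byte_count) in merged.items():
    print(f"{ip},{packets},{byte_count}")
```

For the 09:00 file this collapses the two 192.168.0.1 rows into a single row holding the summed counters.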
