Re: [pmacct-discussion] pmacct performance
Hi Anthony,

What version are you using? You can confirm this with nfacctd -V. Also, please post your full config. I use kafka_history myself in production, so I lean towards one of two explanations: either you are running some code from master that happens not to work, or some combination of directives in your configuration is triggering a bug. In either case, what I pointed out in my previous email would give a clue as to where the code breaks, which would be useful info, especially in the latter case.

Paolo

On Tue, Nov 21, 2017 at 02:34:33PM -0500, Anthony Caiafa wrote:
> Yep so it looks like every time kafka_history runs, no matter what
> interval you put it on, it will crash pmacct and restart the service.

___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
Re: [pmacct-discussion] pmacct performance
Yep, so it looks like every time kafka_history runs, no matter what interval you put it on, it will crash pmacct and restart the service.

On Sat, Nov 18, 2017 at 9:27 AM, Anthony Caiafa <2600...@gmail.com> wrote:
> Sounds good. I’ll be sending out some data to you.
Re: [pmacct-discussion] pmacct performance
Sounds good. I’ll be sending out some data to you.

On Sat, Nov 18, 2017 at 9:25 AM Paolo Lucente wrote:
> Hi Anthony,
>
> Keep me posted on the ordering part. Wrt the complete drop in the
> service, as you described in your original email, I have little info to
> comment on: let's say it should never happen, but at this point I don't
> know whether it's a crash or a graceful shutdown with some message in
> the logs. If you wish, we can take this further and you could start
> from this section of the docs about suspected crashes:
>
> https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1994-L2013
>
> Any output from gdb and such, you can freely take off list and
> unicast to me directly. We can then summarise things back on list.
>
> Paolo
Re: [pmacct-discussion] pmacct performance
Hi Anthony,

Keep me posted on the ordering part. Wrt the complete drop in the service, as you described in your original email, I have little info to comment on: let's say it should never happen, but at this point I don't know whether it's a crash or a graceful shutdown with some message in the logs. If you wish, we can take this further and you could start from this section of the docs about suspected crashes:

https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1994-L2013

Any output from gdb and such, you can freely take off list and unicast to me directly. We can then summarise things back on list.

Paolo

On Fri, Nov 17, 2017 at 10:41:40AM -0500, Anthony Caiafa wrote:
> Hi! So I have the load spread between 3 machines and 2 ports per
> box. The biggest thing in the netflow data is the ordering for me. I
> guess where I am still curious is: would either of those settings be
> causing the complete drop in the service, where it starts and stops
> every 5 minutes on the dot? I am going to play around with the times
> on it to see if it is one of those settings. I will eventually have to
> increase this to about 2-4m flows per second, so maybe the replicator
> is the best way forward.
Re: [pmacct-discussion] pmacct performance
Hi Anthony,

I map the word 'message' to 'flow' and not to NetFlow packet; please correct me if this assumption is wrong. 55m flows/min makes it roughly 1m flows/sec. I would not recommend stretching a single nfacctd daemon beyond 200K flows/sec, and the beauty of NetFlow, being UDP, is that it can be easily scaled horizontally. Details and complexity may vary from use-case to use-case, so as a start I would recommend looking in the following direction: point all NetFlow to a single IP/port where an nfacctd in replicator mode is listening. You should test that it is able to absorb the full feed with your CPU resources. Then you replicate parts of the full feed to nfacctd collectors downstream, ie. with some headroom you can instantiate around 6-8 nfacctd collectors. You can balance the incoming NetFlow packets using round-robin, by assigning flow exporters to flow collectors, or with some hashing. Here is how to start with it:

https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1384-L1445

Of course you can do the same with your load-balancer of preference.

Paolo

On Thu, Nov 16, 2017 at 01:16:48PM -0500, Anthony Caiafa wrote:
> Hi! So my usecase may be slightly larger than most. I am processing 1:1
> netflow data for a larger infrastructure. We are receiving about 55 million
> messages a minute, which isn’t much, but through pmacct it seems to not like
> it so much. I have pmacct scheduled with nomad running across a few
> machines and 2 designated ports accepting the flow traffic and outputting
> those to kafka.
>
> About every 5m or so pmacct dies and restarts, basically dropping all
> traffic for a short period of time. The two configurations I have that are
> doing anything every 5 minutes are:
>
> kafka_refresh_time[name]: 300
> kafka_history[name]: 5m
>
> So I am not sure if it's one of these or not, since the logs only indicate
> that it lost a connection to kafka and that's about it.
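
The sizing in Paolo's reply can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name collectors_needed is made up for this example, and the 200K flows/sec ceiling per daemon is the rule of thumb from the email above, not a measured limit.

```python
import math


def collectors_needed(flows_per_min, per_daemon_flows_per_sec=200_000):
    """Return (flows per second, minimum number of collector daemons).

    per_daemon_flows_per_sec defaults to the 200K flows/sec rule of thumb
    quoted in the thread; it is an assumption, not a hard limit.
    """
    flows_per_sec = flows_per_min / 60
    minimum = math.ceil(flows_per_sec / per_daemon_flows_per_sec)
    return flows_per_sec, minimum


# Current feed described in the thread: about 55 million flows per minute.
rate, minimum = collectors_needed(55_000_000)
print(f"{rate:,.0f} flows/sec, at least {minimum} collectors before headroom")

# Future target mentioned in the thread: 2-4 million flows per second.
for target_per_sec in (2_000_000, 4_000_000):
    _, needed = collectors_needed(target_per_sec * 60)
    print(f"{target_per_sec:,} flows/sec, at least {needed} collectors")
```

The bare minimum for the current feed comes out below the 6-8 collectors suggested above, which is consistent with that figure already including headroom.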
Re: [pmacct-discussion] pmacct performance
Hi Stathis,

Since you use PF_RING, you can review some advice I gave to Joan a couple of months back, when he was asking how to scale up a pmacctd deployment; see specifically the replication idea I gave in the following email:

https://www.mail-archive.com/pmacct-discussion@pmacct.net/msg02447.html

Speaking specifically of the classification part: my gut feeling is that this is a bit too much resource usage for a single classifier that is looking for an HTTP hostname (I'm not necessarily implying your shared object is the culprit here). It would be great if we could debug/review this together. Shall we follow up privately on this?

Cheers,
Paolo

On Mon, Apr 07, 2014 at 11:39:24PM +0300, Stathis Gkotsis wrote:
> Hi Paolo,
>
> Yes, I use pfring. It is both traffic rate and classification which cause
> the CPU to go to 100%. If I do not use any classifiers, CPU is around 40%;
> then, when I enable the classifier, CPU goes above 95%. The classifier is a
> shared library which tries to match a series of bytes in the packet payload,
> basically searches for a hostname in the packet payload (I am interested in
> HTTP traffic).
>
> Thanks,
> Stathis
Re: [pmacct-discussion] pmacct performance
Hi Paolo,

In your advice to Joan, I guess you refer to Libzero: http://www.ntop.org/products/pf_ring/libzero-for-dna/ . It should not be too difficult to leverage this in pmacct...

Regards,
Stathis

Date: Tue, 8 Apr 2014 16:51:00 +
From: pa...@pmacct.net
To: stathisgot...@hotmail.com
CC: pmacct-discussion@pmacct.net
Subject: Re: [pmacct-discussion] pmacct performance

> Hi Stathis,
>
> Since you use PF_RING, you can review some advice I gave to Joan a couple
> of months back, when he was asking how to scale up a pmacctd deployment;
> see specifically the replication idea I gave in the following email:
>
> https://www.mail-archive.com/pmacct-discussion@pmacct.net/msg02447.html
>
> Speaking specifically of the classification part: my gut feeling is that
> this is a bit too much resource usage for a single classifier that is
> looking for an HTTP hostname (I'm not necessarily implying your shared
> object is the culprit here). It would be great if we could debug/review
> this together. Shall we follow up privately on this?
>
> Cheers,
> Paolo
Re: [pmacct-discussion] pmacct performance
06.04.2014 20:17, Stathis Gkotsis wrote:
> Hi all, I am using pmacctd with libpcap.

I will add my 5 cents. Using libpcap on a high-load system is not a good idea: you will lose more than 50% of the traffic. IMHO a better idea is to use http://sourceforge.net/projects/ipt-netflow/ to create NetFlow and then use nfacct to analyze and record it.

--
WBR, Viacheslav Dubrovskyi
Re: [pmacct-discussion] pmacct performance
Hi Stathis,

Two questions on your current setup:

1) Are you already using pmacct against a PF_RING-enabled libpcap? You made reference to this in your email;
2) Can you determine what makes CPU go to 100%? Is it traffic rate or classification? Determining this is key to steering further recommendations.

Cheers,
Paolo

On Sun, Apr 06, 2014 at 08:17:07PM +0300, Stathis Gkotsis wrote:
> Hi all,
>
> I am using pmacctd with libpcap. My configuration is the following:
>
> daemonize: false
> pcap_filter: port 80 // only interested in HTTP traffic
> plugin_pipe_size: 10240
> plugin_buffer_size: 102400
> aggregate: src_host,dst_host,src_port,dst_port,proto,class
> classifiers: [path_to_classifier]
> snaplen: 500
> interface: any
> plugins: print
> print_num_protos: true
> print_cache_entries: 15485863
> print_output: csv
> print_time_roundoff: mhd
> print_output_file: file.%s.%Y%m%d-%H%M.txt
> print_refresh_time: 300
>
> I have defined one classifier and, on the machine I am using, CPU usage of
> the core process is close to 100%. I have read the relevant FAQ question
> about high CPU usage and applied what it proposes. The question now is how
> pmacct could cope with more traffic:
>
> - Are there any other ways to optimize pmacct itself or its configuration?
> - I was thinking of launching multiple pmacctd instances, each instance
>   receiving a portion of the traffic. This split could be done through a BPF
>   filter. How would you split the traffic? For example, you can split based
>   on one bit of the IP address... The goal would be that the separate
>   instances are balanced in terms of CPU usage.
> - Is pmacct compiled with all relevant gcc optimizations?
>
> Thanks,
> Stathis
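
The BPF-based split Stathis asks about could be sketched as follows. This is a minimal illustration and not something taken from pmacct: the helper name split_filters is hypothetical, and it assumes IPv4 traffic only and a power-of-two number of instances. It keys on the low bits of the last byte of the source address (ip[15] in pcap-filter syntax), so the two directions of a connection will usually land on different instances, and how evenly the load spreads depends entirely on the address mix of the traffic.

```python
def split_filters(base_filter, instances):
    """Build one pcap filter string per pmacctd instance.

    Each filter keeps the base expression and adds a test on the low bits
    of the last byte of the IPv4 source address (byte offset 15 in the IP
    header). 'instances' must be a power of two no larger than 256.
    """
    if instances < 1 or instances > 256 or instances & (instances - 1):
        raise ValueError("instances must be a power of two between 1 and 256")
    if instances == 1:
        return [base_filter]
    mask = instances - 1
    return [
        f"({base_filter}) and (ip[15] & {mask} = {i})"
        for i in range(instances)
    ]


# Two instances, each given its own value for the pcap_filter directive.
for expr in split_filters("port 80", 2):
    print(f"pcap_filter: {expr}")
```

Each generated expression would go into the pcap_filter directive of a separate pmacctd configuration. Whether this actually balances CPU usage would need to be measured on real traffic.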
Re: [pmacct-discussion] pmacct performance
Hi Paolo,

Yes, I use pfring. It is both traffic rate and classification which cause the CPU to go to 100%. If I do not use any classifiers, CPU is around 40%; then, when I enable the classifier, CPU goes above 95%. The classifier is a shared library which tries to match a series of bytes in the packet payload, basically searches for a hostname in the packet payload (I am interested in HTTP traffic).

Thanks,
Stathis

Date: Mon, 7 Apr 2014 18:48:55 +
From: pa...@pmacct.net
To: pmacct-discussion@pmacct.net
Subject: Re: [pmacct-discussion] pmacct performance

> Hi Stathis,
>
> Two questions on your current setup:
>
> 1) Are you already using pmacct against a PF_RING-enabled libpcap? You made
>    reference to this in your email;
> 2) Can you determine what makes CPU go to 100%? Is it traffic rate or
>    classification? Determining this is key to steering further
>    recommendations.
>
> Cheers,
> Paolo
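
To make the kind of work such a classifier does per packet more concrete, here is a rough sketch of a payload search for an HTTP hostname. It is an assumption-laden illustration only: the real classifier in this thread is a shared library whose code is not shown, and the function name find_http_host is invented for this example. The search is a linear scan over the captured payload, which hints at why enabling it on every packet adds CPU cost.

```python
def find_http_host(payload):
    """Return the value of the HTTP Host header in a raw payload, or None.

    The header name is matched case-insensitively. If the header line is
    cut off (for instance by a short snaplen), None is returned.
    """
    marker = b"\r\nhost:"
    # bytes.lower() keeps the length unchanged, so offsets stay valid.
    start = payload.lower().find(marker)
    if start < 0:
        return None
    start += len(marker)
    end = payload.find(b"\r\n", start)
    if end < 0:
        return None
    return payload[start:end].strip().decode("ascii", errors="replace")


request = b"GET / HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
print(find_http_host(request))
```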