Sounds like you already have the DB server hardware. It may be a good idea to simulate the data flows to and from your DB. A few scripts that insert data at different constant rates and/or intermittently, the way it normally arrives from nfacctd, can generate the input to the DB. At the same time you can prepare the next processing steps using the fake data. This should reveal bottlenecks and give you the chance to address them before they appear in the live system. E.g. multiple NetFlow collectors that write to the same tablespace may lock each other out and decrease insert performance. The same applies to reading the written data for further processing. Reducing the locks can be challenging; splitting the tablespace with partitioning, or using separate inbound tables per collector, can help.
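A rough sketch of such a load script, in Python. Everything in it is an assumption for illustration: the five-column flow schema, the table names `flows_in_c1`/`flows_in_c2`, and the "idle most of the time, then a 4x burst" model are made up, and SQLite in memory only stands in for the real PostgreSQL server so the script runs anywhere. Against the real DB you would swap the connection for a PostgreSQL driver and adjust the placeholder syntax.

```python
import random
import sqlite3
import time


def make_inbound_table(conn, table):
    """Create one inbound table (hypothetical schema, one table per collector)."""
    conn.execute(
        f"CREATE TABLE {table} ("
        "collector_id INTEGER, ip_src TEXT, ip_dst TEXT, "
        "bytes INTEGER, stamp_inserted INTEGER)"
    )


def fake_flows(n, collector_id, rng):
    """Build n fake aggregated flow records with random addresses and byte counts."""
    now = int(time.time())
    return [
        (
            collector_id,
            f"10.0.{rng.randrange(256)}.{rng.randrange(256)}",
            f"10.1.{rng.randrange(256)}.{rng.randrange(256)}",
            rng.randrange(64, 1_000_000),
            now,
        )
        for _ in range(n)
    ]


def batch_sizes(rate_per_sec, seconds, bursty, rng):
    """Yield one batch size per simulated second.

    Constant mode yields rate_per_sec every second. Bursty mode yields
    nothing about 75% of the time and a 4x batch otherwise, so the
    long-run average stays close to rate_per_sec.
    """
    for _ in range(seconds):
        if not bursty:
            yield rate_per_sec
        else:
            yield rate_per_sec * 4 if rng.random() < 0.25 else 0


def run_load(conn, table, collector_id, rate_per_sec, seconds,
             bursty=False, pace=False, seed=0):
    """Insert fake flows and return (rows_inserted, elapsed_seconds).

    With pace=True each batch is padded to one wall-clock second, which
    mimics a live feed; with pace=False the batches run back to back,
    which measures the maximum insert throughput instead.
    """
    rng = random.Random(seed)
    inserted = 0
    started = time.perf_counter()
    for size in batch_sizes(rate_per_sec, seconds, bursty, rng):
        tick = time.perf_counter()
        if size:
            conn.executemany(
                f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)",
                fake_flows(size, collector_id, rng),
            )
            conn.commit()
            inserted += size
        if pace:
            time.sleep(max(0.0, 1.0 - (time.perf_counter() - tick)))
    return inserted, time.perf_counter() - started


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # 200000 flows per minute is roughly 3333 per second.
    for cid, bursty in ((1, False), (2, True)):
        table = f"flows_in_c{cid}"
        make_inbound_table(conn, table)
        rows, secs = run_load(conn, table, cid, 3333, 5, bursty=bursty)
        print(f"{table}: {rows} rows in {secs:.3f}s")
```

Running several copies of this in parallel against one shared table, and then against one table per collector, should show whether lock contention is a problem on your setup. The same fake data can feed the reporting queries while the inserts are running.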
Good luck!
Mario

> -----Original Message-----
> From: pmacct-discussion [mailto:pmacct-discussion-boun...@pmacct.net]
> On Behalf Of Stephen Clark
> Sent: Thursday, August 18, 2016 2:24 PM
> To: email@example.com
> Subject: Re: [pmacct-discussion] collecting large number of netflows
>
> On 08/17/2016 08:38 AM, Jentsch, Mario wrote:
> > Hey Steve,
> >
> > that question can't be answered without a lot of assumptions about
> > the details of your project and we made the experience that even
> > with project details it is a hard thing to predict due to the nature
> > of network traffic patterns. Pmacct (namely nfacctd) can handle that
> > number of flows - even with only one instance - and is most probably
> > not the bottleneck. If it is possible what you plan to do, depends
> > on questions like "how many records per timebin do you have after
> > aggregation in nfacctd" - this is what your backend DB has to handle
> > and "how is this data processed later on?" - this has more or less
> > impact on DB performance and the time it takes to create reports or
> > feed any user interfaces.
> >
> > Regards,
> > Mario
> Hi Mario,
>
> Thanks for the response. We will be collecting data from about 200
> probes. This is a new endeavor so I guess we be learning on the fly.
> We are planning on using fsrc sampling feature set at 200000 flows per
> minute with inserts only into a postgresql 9.4 DB running on CentOS
> 6.8 in VMware on a hefty Cisco UCS system.
>
> Regards,
> Steve
> >> -----Original Message-----
> >> From: pmacct-discussion [mailto:pmacct-discussion-boun...@pmacct.net]
> >> On Behalf Of Stephen Clark
> >> Sent: Thursday, August 04, 2016 5:01 PM
> >> To: firstname.lastname@example.org
> >> Subject: [pmacct-discussion] collecting large number of netflows
> >>
> >> Hi List,
> >>
> >> I am looking to collect a large number of netflow records, on the
> >> order of a 100 million a day, and store them in a postgres DB.
> >> Has anyone done this or something similar using pmacct?
> >>
> >> Thanks,
> >> Steve
> >>
> >>
> >
> _______________________________________________
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists