Sounds like you already have the DB server hardware.
It may be a good idea to simulate the data flows to and from your DB. A few 
scripts that insert data at different constant rates and/or in intermittent 
bursts, the way it normally arrives from nfacctd, can generate the input to the 
DB. At the same time you can prepare the next processing steps with the fake 
data. This should reveal bottlenecks and give you the chance to address them 
before they appear in the live system.
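
A rough sketch of such a load script, in Python. Everything in it is an
assumption for illustration: the row layout, the function names and the
batch sizes are made up, and insert() is a callback you supply yourself
(for example a function that runs an INSERT through your PostgreSQL driver).
The 200000 rows per minute in the usage note is simply the figure from
Steve's mail.

```python
import random
import time
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical row layout; adapt it to the columns your aggregation writes.
FlowRow = Tuple[str, str, int, int]  # (ip_src, ip_dst, packets, bytes)
Schedule = Iterator[Tuple[float, int]]  # (offset in seconds, rows in batch)


def fake_flow(rng: random.Random) -> FlowRow:
    """Build one made-up flow record from private address space."""
    src = f"10.{rng.randint(0, 255)}.{rng.randint(0, 255)}.{rng.randint(1, 254)}"
    dst = f"192.168.{rng.randint(0, 255)}.{rng.randint(1, 254)}"
    packets = rng.randint(1, 1000)
    return (src, dst, packets, packets * rng.randint(64, 1500))


def constant_schedule(rows_per_minute: int, batch_size: int,
                      duration_sec: int) -> Schedule:
    """Evenly spaced full batches that add up to a steady insert rate."""
    total_rows = rows_per_minute * duration_sec // 60
    n_batches = total_rows // batch_size
    if n_batches == 0:
        return
    interval = duration_sec / n_batches
    for i in range(n_batches):
        yield (i * interval, batch_size)


def burst_schedule(rows_per_minute: int, batch_size: int,
                   duration_sec: int, period_sec: int) -> Schedule:
    """Same average rate, but all rows of a period arrive at its start."""
    rows_per_period = rows_per_minute * period_sec // 60
    for k in range(duration_sec // period_sec):
        remaining = rows_per_period
        while remaining > 0:
            size = min(batch_size, remaining)
            yield (float(k * period_sec), size)
            remaining -= size


def run(schedule: Iterable[Tuple[float, int]],
        insert: Callable[[List[FlowRow]], None],
        rng: random.Random,
        sleep: Callable[[float], None] = time.sleep) -> Dict[str, object]:
    """Feed fake batches to insert() on schedule and time every call."""
    start = time.monotonic()
    latencies: List[float] = []
    rows_sent = 0
    for offset, size in schedule:
        delay = start + offset - time.monotonic()
        if delay > 0:
            sleep(delay)  # ahead of schedule: wait for the next slot
        # delay <= 0 means insert() is not keeping up with the target rate
        rows = [fake_flow(rng) for _ in range(size)]
        t0 = time.monotonic()
        insert(rows)
        latencies.append(time.monotonic() - t0)
        rows_sent += size
    return {"rows": rows_sent, "latencies": latencies}
```

For example, run(constant_schedule(200000, 1000, 600), my_insert,
random.Random(1)) would push a steady 200000 rows per minute for ten minutes,
and swapping in burst_schedule(200000, 1000, 600, 60) delivers the same volume
as one burst per minute. Growing latencies, or a run that takes longer than
the scheduled duration, point at the DB side.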
E.g. multiple Netflow collectors that write to the same tablespace may lock 
each other out and decrease insert performance. The same applies to reading the 
written data for further processing. Reducing the locks can be challenging; 
splitting the tablespace with partitioning, or using separate inbound tables 
per collector, can help.
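
To illustrate the per-collector inbound tables, here is a small Python sketch
that only generates the DDL. The base table name "acct_in" and the collector
ids are hypothetical, and it assumes a template table with the right columns
already exists. It relies on PostgreSQL's CREATE TABLE ... (LIKE ...) form,
which copies the column layout of the template.

```python
import re
from typing import Iterable, List


def inbound_table_name(base: str, collector_id: str) -> str:
    """Derive a safe table name such as acct_in_collector_1."""
    safe = re.sub(r"[^a-z0-9_]", "_", collector_id.lower())
    return f"{base}_{safe}"


def inbound_table_ddl(base: str, collector_ids: Iterable[str]) -> List[str]:
    """One CREATE TABLE statement per collector, cloned from the base table."""
    return [
        f"CREATE TABLE IF NOT EXISTS {inbound_table_name(base, cid)} "
        f"(LIKE {base} INCLUDING ALL);"
        for cid in collector_ids
    ]


if __name__ == "__main__":
    # Hypothetical collector ids; print the statements for review.
    for stmt in inbound_table_ddl("acct_in", ["collector-1", "collector-2"]):
        print(stmt)
```

Each collector would then write only to its own table, and the downstream
processing can read and empty those tables one at a time instead of competing
with every writer on a single table.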

Good luck!
Mario

> -----Original Message-----
> From: pmacct-discussion [mailto:pmacct-discussion-boun...@pmacct.net]
> On Behalf Of Stephen Clark
> Sent: Thursday, August 18, 2016 2:24 PM
> To: pmacct-discussion@pmacct.net
> Subject: Re: [pmacct-discussion] collecting large number of netflows
> 
> On 08/17/2016 08:38 AM, Jentsch, Mario wrote:
> > Hey Steve,
> >
> > that question can't be answered without a lot of assumptions about the
> > details of your project, and our experience is that even with project
> > details it is hard to predict due to the nature of network traffic
> > patterns. Pmacct (namely nfacctd) can handle that number of flows - even
> > with only one instance - and is most probably not the bottleneck. Whether
> > what you plan to do is possible depends on questions like "how many
> > records per timebin do you have after aggregation in nfacctd" - this is
> > what your backend DB has to handle - and "how is this data processed later
> > on?" - this has more or less impact on DB performance and the time it
> > takes to create reports or feed any user interfaces.
> >
> > Regards,
> > Mario
> Hi Mario,
> 
> Thanks for the response. We will be collecting data from about 200 probes.
> This is a new endeavor, so I guess we will be learning on the fly. We are
> planning on using the fsrc sampling feature set at 200000 flows per minute,
> with inserts only into a PostgreSQL 9.4 DB running on CentOS 6.8 in VMware on
> a hefty Cisco UCS system.
> 
> Regards,
> Steve
> >> -----Original Message-----
> >> From: pmacct-discussion [mailto:pmacct-discussion-boun...@pmacct.net]
> >> On Behalf Of Stephen Clark
> >> Sent: Thursday, August 04, 2016 5:01 PM
> >> To: pmacct-discussion@pmacct.net
> >> Subject: [pmacct-discussion] collecting large number of netflows
> >>
> >> Hi List,
> >>
> >> I am looking to collect a large number of netflow records, on the order
> >> of 100 million a day, and store them in a postgres DB. Has anyone done
> >> this or something similar using pmacct?
> >>
> >> Thanks,
> >> Steve
> >>
> >>
> 
> 
> _______________________________________________
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists
