Re: [pmacct-discussion] Receiving Netflow, enrich with additional informations and send as netflow to collector

2018-04-30 Thread Anthony Caiafa
It does not have native netflow. You could still use pmacct and then from
there process the data through kafka and have nifi grab the data from kafka
and enrich it from there.
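
The enrichment step after Kafka could look roughly like the sketch below. It assumes pmacct publishes JSON records carrying an `ip_src` field (as in the payload quoted in the pnda.io thread in this archive). The prefix table and the `customer` key are hypothetical, and reading from and writing to Kafka is left out.

```python
import ipaddress
import json

# Hypothetical lookup table (documentation prefixes): network -> customer label.
CUSTOMERS = {
    "192.0.2.0/24": "customer-a",
    "198.51.100.0/24": "customer-b",
}


def enrich(record_json, customers=CUSTOMERS):
    """Add a 'customer' key to one JSON flow record based on its ip_src."""
    record = json.loads(record_json)
    src = ipaddress.ip_address(record["ip_src"])
    # First matching prefix wins; unmatched sources are labelled "unknown".
    record["customer"] = next(
        (label for prefix, label in customers.items()
         if src in ipaddress.ip_network(prefix)),
        "unknown",
    )
    return json.dumps(record)


if __name__ == "__main__":
    print(enrich('{"event_type": "purge", "ip_src": "192.0.2.10", "ip_dst": "203.0.113.5"}'))
```

The same function could sit behind whatever consumer is used (NiFi, a Kafka client, or anything else); only the per-record transformation is shown here.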

On Mon, Apr 30, 2018 at 3:40 AM Tim Weippert <we...@weiti.org> wrote:

> Hi Anthony,
>
> On Sat, Apr 28, 2018 at 11:09:45AM +, Anthony Caiafa wrote:
> > Phase 4 you can easily achieve with Apache nifi. You could also enrich the
> > data and send to another queue with Apache nifi also.
>
> Thanks for the suggestion, but as i read about nifi, it has no native
> netflow module/processor, so i think it can't be used to
> recreate/replicate a filtered netflow stream, or am i missing something
> in the docs?
>
> regards,
> tim
>
> --
> Tim Weippert
> http://weiti.org - we...@weiti.org
> GPG Fingerprint - E704 7303 6FF0 8393 ADB1  398E 67F2 94AE 5995 7DD8
>
> ___
> pmacct-discussion mailing list
> http://www.pmacct.net/#mailinglists
>
___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists

Re: [pmacct-discussion] Receiving Netflow, enrich with additional informations and send as netflow to collector

2018-04-28 Thread Anthony Caiafa
You can easily achieve phase 4 with Apache NiFi. You could also use NiFi to
enrich the data and send it to another queue.

On Sat, Apr 28, 2018 at 3:55 AM Tim Weippert wrote:

> Hi Paolo,
>
> On Fri, Apr 27, 2018 at 05:02:07PM +, Paolo Lucente wrote:
> >
> > Hi Tim,
> >
> > The first three points can be achieved no problem; the 4th is currently
> > not possible. I'm soon going to start working on a dev that would allow
> > you to achieve filtering (of networks) with the tee plugin (as it does
> > exist today for sFlow). At least in a phase 1 it will not be possible to
> > include any (BGP) enrichment in the replicated NetFlow stream.
>
> That sounds very good. If you need someone for testing the dev tree,
> just drop me a note. This is very interesting for me in the first place, as
> this is a scenario i want to achieve: replicate with tee + filtering of
> networks to limit the netflow to only the customer-accessible networks
> within a shared access.
>
> The enrichment is only a nice to have feature :)
>
> regards,
> tim
>
> --
> Tim Weippert
> http://weiti.org - we...@weiti.org
> GPG Fingerprint - E704 7303 6FF0 8393 ADB1  398E 67F2 94AE 5995 7DD8
>

Re: [pmacct-discussion] ipv4 conversion to int

2018-04-21 Thread Anthony Caiafa
Ok let me test this out. Thanks!

On Sat, Apr 21, 2018 at 9:33 AM Paolo Lucente <pa...@pmacct.net> wrote:

>
> Hi Anthony,
>
> The problem with your specific lines is the names, src_host and dst_host.
> These are reserved. As Yann was proposing, use src_host_int and
> dst_host_int instead, for example. In your 'aggregate' line you will also
> have to change src_host and dst_host into src_host_int and dst_host_int.
> I tried this and it is working for me.
>
> Paolo
>
> On Thu, Apr 19, 2018 at 03:01:24PM -0400, Anthony Caiafa wrote:
> > Yep that didn't work when i tried it. It still exported the src_host as a string.
> >
> > On Thu, Apr 19, 2018 at 2:58 PM, Paolo Lucente <pa...@pmacct.net> wrote:
> > >
> > > Hi Anthony,
> > >
> > > Yes, nice tip from Yann actually - i'm going to document it :) This
> > > would work and is portable across all plugins, you can give it a try.
> > > Specifically for the SQL plugins, which i understand is not your case,
> > > a sql_num_hosts feature exists - failing the custom primitive approach
> > > for any unforeseen reason, we could consider a porting of this feature.
> > >
> > > Paolo
> > >
> > > On Thu, Apr 19, 2018 at 11:04:24AM -0400, Anthony Caiafa wrote:
> > >> So this should technically work?
> > >>
> > >> name=src_host      field_type=8   len=4   semantics=u_int
> > >> name=dst_host      field_type=12  len=4   semantics=u_int
> > >>
> > >> Going against the direct keys instead of creating a new one for src_host_int.
> > >>
> > >> On Thu, Apr 19, 2018 at 10:05 AM, Anthony Caiafa <2600...@gmail.com> wrote:
> > >> > yeah backend is clickhouse and it has a similar function. However
> > >> > conversion for range queries is meh. might as well store as an int.
> > >> >
> > >> > On Thu, Apr 19, 2018 at 7:21 AM, Karl O. Pinc <k...@meme.com> wrote:
> > >> >> On Thu, 19 Apr 2018 07:30:12 +
> > >> >> Yann Belin <y.belin...@gmail.com> wrote:
> > >> >>
> > >> >>> As far as I know it doesn't but if you use nfacctd, you can easily
> > >> >>> define your own primitives to do the same job:
> > >> >>
> > >> >>> On Thu, Apr 19, 2018 at 12:14 AM Anthony Caiafa <2600...@gmail.com>
> > >> >>> wrote:
> > >> >>>
> > >> >>> > Does this feature currently exist? Having the ability to convert
> > >> >>> > the ipv4 key field to an int?
> > >> >>
> > >> >> Another option would be to save your data in PostgreSQL
> > >> >> and use the ip address data type, converting from there
> > >> >> on output if necessary.
> > >> >>
> > >> >> https://www.postgresql.org/docs/10/static/datatype-net-types.html
> > >> >>
> > >> >> Regards,
> > >> >>
> > >> >> Karl <k...@meme.com>
> > >> >> Free Software:  "You don't pay back, you pay forward."
> > >> >>  -- Robert A. Heinlein
> > >> >>

Re: [pmacct-discussion] ipv4 conversion to int

2018-04-19 Thread Anthony Caiafa
Yep that didn't work when i tried it. It still exported the src_host as a string.

On Thu, Apr 19, 2018 at 2:58 PM, Paolo Lucente <pa...@pmacct.net> wrote:
>
> Hi Anthony,
>
> Yes, nice tip from Yann actually - i'm going to document it :) This
> would work and is portable across all plugins, you can give it a try.
> Specifically for the SQL plugins, which i understand is not your case,
> a sql_num_hosts feature exists - failing the custom primitive approach
> for any unforeseen reason, we could consider a porting of this feature.
>
> Paolo
>
> On Thu, Apr 19, 2018 at 11:04:24AM -0400, Anthony Caiafa wrote:
>> So this should technically work?
>>
>> name=src_host      field_type=8   len=4   semantics=u_int
>> name=dst_host      field_type=12  len=4   semantics=u_int
>>
>> Going against the direct keys instead of creating a new one for src_host_int.
>>
>> On Thu, Apr 19, 2018 at 10:05 AM, Anthony Caiafa <2600...@gmail.com> wrote:
>> > yeah backend is clickhouse and it has a similar function. However
>> > conversion for range queries is meh. might as well store as an int.
>> >
>> > On Thu, Apr 19, 2018 at 7:21 AM, Karl O. Pinc <k...@meme.com> wrote:
>> >> On Thu, 19 Apr 2018 07:30:12 +
>> >> Yann Belin <y.belin...@gmail.com> wrote:
>> >>
>> >>> As far as I know it doesn't but if you use nfacctd, you can easily
>> >>> define your own primitives to do the same job:
>> >>
>> >>> On Thu, Apr 19, 2018 at 12:14 AM Anthony Caiafa <2600...@gmail.com>
>> >>> wrote:
>> >>>
>> >>> > Does this feature currently exist? Having the ability to convert
>> >>> > the ipv4 key field to an int?
>> >>
>> >> Another option would be to save your data in PostgreSQL
>> >> and use the ip address data type, converting from there
>> >> on output if necessary.
>> >>
>> >> https://www.postgresql.org/docs/10/static/datatype-net-types.html
>> >>
>> >> Regards,
>> >>
>> >> Karl <k...@meme.com>
>> >> Free Software:  "You don't pay back, you pay forward."
>> >>  -- Robert A. Heinlein
>> >>


Re: [pmacct-discussion] ipv4 conversion to int

2018-04-19 Thread Anthony Caiafa
yeah backend is clickhouse and it has a similar function. However
conversion for range queries is meh. might as well store as an int.
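
To illustrate why integer storage helps with range queries: a prefix becomes a simple low/high pair, and membership becomes two integer comparisons. This is a minimal sketch using Python's standard `ipaddress` module; the function names are made up for the example and nothing here is specific to ClickHouse or pmacct.

```python
import ipaddress


def cidr_to_int_range(cidr):
    """Return (first, last) address of an IPv4 prefix as unsigned integers."""
    net = ipaddress.IPv4Network(cidr, strict=True)
    return int(net.network_address), int(net.broadcast_address)


def in_range(addr_int, cidr):
    """True if an integer-encoded IPv4 address falls inside the prefix."""
    low, high = cidr_to_int_range(cidr)
    return low <= addr_int <= high


if __name__ == "__main__":
    low, high = cidr_to_int_range("10.0.0.0/8")
    print(low, high)
    print(in_range(int(ipaddress.IPv4Address("10.1.2.3")), "10.0.0.0/8"))
```

With the addresses stored as integers, the equivalent database filter is a plain `BETWEEN low AND high` on the column, with no per-row string parsing.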

On Thu, Apr 19, 2018 at 7:21 AM, Karl O. Pinc <k...@meme.com> wrote:
> On Thu, 19 Apr 2018 07:30:12 +
> Yann Belin <y.belin...@gmail.com> wrote:
>
>> As far as I know it doesn't but if you use nfacctd, you can easily
>> define your own primitives to do the same job:
>
>> On Thu, Apr 19, 2018 at 12:14 AM Anthony Caiafa <2600...@gmail.com>
>> wrote:
>>
>> > Does this feature currently exist? Having the ability to convert
>> > the ipv4 key field to an int?
>
> Another option would be to save your data in PostgreSQL
> and use the ip address data type, converting from there
> on output if necessary.
>
> https://www.postgresql.org/docs/10/static/datatype-net-types.html
>
> Regards,
>
> Karl <k...@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>  -- Robert A. Heinlein
>


Re: [pmacct-discussion] ipv4 conversion to int

2018-04-19 Thread Anthony Caiafa
Ah true. Let me give this a whirl.

On Thu, Apr 19, 2018 at 3:32 AM Yann Belin <y.belin...@gmail.com> wrote:

> As far as I know it doesn't but if you use nfacctd, you can easily define
> your own primitives to do the same job:
>
> name=src_host_int  field_type=8   len=4   semantics=u_int
> name=dst_host_int  field_type=12  len=4   semantics=u_int
>
> Then, you can use those primitives instead of the standard ones in your
> config.
>
> On Thu, Apr 19, 2018 at 12:14 AM Anthony Caiafa <2600...@gmail.com> wrote:
>
>> Does this feature currently exist? Having the ability to convert the ipv4
>> key field to an int?
>>

[pmacct-discussion] ipv4 conversion to int

2018-04-18 Thread Anthony Caiafa
Does this feature currently exist? Having the ability to convert the ipv4
key field to an int?
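
For reference, the conversion being asked about is the usual mapping of a dotted-quad IPv4 address to its 32-bit unsigned value. A minimal sketch with Python's standard `ipaddress` module follows; the helper names are hypothetical and this is independent of how pmacct itself would export the field.

```python
import ipaddress


def ipv4_to_int(addr):
    """Dotted-quad string -> 32-bit unsigned integer."""
    return int(ipaddress.IPv4Address(addr))


def int_to_ipv4(value):
    """32-bit unsigned integer -> dotted-quad string."""
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    n = ipv4_to_int("192.0.2.1")
    print(n)               # 3221225985
    print(int_to_ipv4(n))  # 192.0.2.1
```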

Re: [pmacct-discussion] pmacct and pnda.io integration

2018-03-26 Thread Anthony Caiafa
It's a really simple and solid open-source product. Absolutely worth the
effort, since it is extremely powerful.

On Mon, Mar 26, 2018 at 9:44 AM, Jaime Botello <jbote...@riotgames.com>
wrote:

> Hi Anthony,
>
> Not familiar with it, thank you.  I will bring this to our team working on
> figuring out how to remove logstash from the workflow.
>
> --Jaime
>
>
>
> On Mon, Mar 26, 2018 at 6:27 AM, Anthony Caiafa <2600...@gmail.com> wrote:
>
>> Just adding 2 cents here. It seems like quite a few steps and going back
>> and forth to kafka. You should look at Apache NIFI which will take the data
>> from pmacct and it will cut down on all of those steps.
>>
>> On Mon, Mar 26, 2018 at 9:08 AM Jaime Botello <jbote...@riotgames.com>
>> wrote:
>>
>>> Hi Paolo,
>>>
>>> Outside access to the environment is going to be difficult since it is part
>>> of our production environment. However, we may be able to arrange some
>>> remote sessions if that's something that may work.
>>>
>>> Having said that, we were able to find a workaround that works as
>>> follows:
>>>
>>> Since pmacct can't serialize the data to something that pnda.io would
>>> understand, we set up a logstash instance that serves as message
>>> translation between pmacct and pnda.io.
>>>
>>>
>>>
>>> Logstash is configured to use pnda-avro codec plugin that supports
>>> ArrayByteSerialization.  This seems to be working for now and it will
>>> provide us with some time to figure out how we can integrate
>>> pmacct directly with pnda.io so we can increase the overall throughput
>>> of the system and maximize the efficiency.
>>>
>>> If there's any interest in the details, I can share some of the
>>> documentation we are working on right now.
>>>
>>> thank you
>>>
>>> --Jaime
>>>
>>> On Sat, Mar 24, 2018 at 1:28 PM, Paolo Lucente <pa...@pmacct.net> wrote:
>>>
>>>>
>>>> Hey Jaime,
>>>>
>>>> What you say does make sense to me and would be up to this dev. Can i
>>>> ask you if it would be a possibility to access your deployment (since i
>>>> do not have the PNDA framework deployed anywhere)? It would make easier
>>>> development and subsequent testing. If yes, we can follow up privately.
>>>>
>>>> Paolo
>>>>
>>>>
>>>> On Wed, Mar 21, 2018 at 07:56:42PM -0700, Jaime Botello wrote:
>>>> > Hey Paolo,
>>>> >
>>>> > I was thinking about this after reading a little bit more on how data is
>>>> > deserialized by pnda.io
>>>> >
>>>> > For example, if I download the file from pnda hdfs and read it using avro
>>>> > tools, you can see pnda(kafka) was not able to deserialize the data.
>>>> > Since Pnda uses byte array deserialization, and by reading their logstash
>>>> > integration notes, they clearly mention they use byte array serialization,
>>>> > don't you think we could fix this by just adding byte array serialization
>>>> > into pmacct kafka plugin?
>>>> >
>>>> > Let me know if this makes sense.
>>>> >
>>>> > thanks
>>>> >
>>>> > ubuntu@ip-10-180-221-47:~/datasets$ java -jar avro-tools-1.8.2.jar tojson
>>>> > f0f01acf-5011-42ec-90b0-c18f21e4e2ab.avro
>>>> > log4j:WARN No appenders could be found for logger
>>>> > (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
>>>> > log4j:WARN Please initialize the log4j system properly.
>>>> > log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
>>>> > more info.
>>>> > {"topic":"netflow","timestamp":1521676850049,"reason":{"string":"*Unable to
>>>> > deserialize data*"},"payload":"{\"event_type\": \"purge\", \"as_src\":
>>>> > 6507, \"as_dst\": 5739, \"peer_ip_src\": \"x.x.x.x\", \"peer_ip_dst\":
>>>> > \"\", \"iface_in\": 654, \"iface_out\": 659, \"ip_src\": \"x.x.x.x\",
>>>> > \"net_src\": \"x.x.x.x\", \"ip_dst\": \&

Re: [pmacct-discussion] pmacct + ELK made easy?

2018-03-03 Thread Anthony Caiafa
Depending on your NetFlow volume you may have to look for an alternative
backend. I am currently working on a post that describes how I am using
pmacct to process about 100 billion records a day and store them for
visualization with Superset.

On Sat, Mar 3, 2018 at 11:15 AM Paolo Lucente <pa...@pmacct.net> wrote:

>
> Anthony is correct. The incarnation of that blog entry about pmacct +
> ELK is the pmacct-to-elasticsearch project that you can find on GitHub:
>
> https://github.com/pierky/pmacct-to-elasticsearch
>
> Also here you can find a guide on how to integrate pmacct with InfluxDB
> (on top of the same blog entry that Anthony already referenced about
> ELK):
>
> https://github.com/pmacct/pmacct/wiki/External-Links
>
> Paolo
>
> On Sat, Mar 03, 2018 at 03:30:38PM +, Anthony Caiafa wrote:
> > It seems you can probably build one based off these two
> >
> > https://blog.pierky.com/integration-of-pmacct-with-elasticsearch-and-kibana/
> >
> > https://blogs.cisco.com/security/step-by-step-setup-of-elk-for-netflow-analytics
> >
> > I am sure with a little more googling you’ll be able to find something
> > or put a post together.
> >
> > On Sat, Mar 3, 2018 at 9:12 AM Jon Nistor <nis...@snickers.org> wrote:
> >
> > > That would be really awesome if there were a guide :>
> > >
> > >
> > > From: Mike Hammett <pmacct-discuss...@ics-il.net>
> > > Reply: pmacct-discussion@pmacct.net
> > > Date: March 3, 2018 at 9:03:00 AM
> > > To: pmacct-discussion@pmacct.net
> > > Subject:  [pmacct-discussion] pmacct + ELK made easy?
> > >
> > > Anyone know of a good A - Z pmacct - ELK stack guide? Debian preferred,
> > > but not required.
> > >
> > >
> > >
> > >
> > > -
> > > Mike Hammett
> > > Intelligent Computing Solutions
> > > http://www.ics-il.com
> > > <https://www.facebook.com/ICSIL>
> > > <https://plus.google.com/+IntelligentComputingSolutionsDeKalb>
> > > <https://www.linkedin.com/company/intelligent-computing-solutions>
> > > <https://twitter.com/ICSIL>
> > > Midwest Internet Exchange
> > > http://www.midwest-ix.com
> > > <https://www.facebook.com/mdwestix>
> > > <https://www.linkedin.com/company/midwest-internet-exchange>
> > > <https://twitter.com/mdwestix>

Re: [pmacct-discussion] pmacct + ELK made easy?

2018-03-03 Thread Anthony Caiafa
It seems you can probably build one based off these two

https://blog.pierky.com/integration-of-pmacct-with-elasticsearch-and-kibana/

https://blogs.cisco.com/security/step-by-step-setup-of-elk-for-netflow-analytics


I am sure with a little more googling you’ll be able to find something
or put a post together.

On Sat, Mar 3, 2018 at 9:12 AM Jon Nistor wrote:

> That would be really awesome if there were a guide :>
>
>
> From: Mike Hammett
> Reply: pmacct-discussion@pmacct.net
> Date: March 3, 2018 at 9:03:00 AM
> To: pmacct-discussion@pmacct.net
> Subject:  [pmacct-discussion] pmacct + ELK made easy?
>
> Anyone know of a good A - Z pmacct - ELK stack guide? Debian preferred,
> but not required.
>
>
>
>
> -
> Mike Hammett
> Intelligent Computing Solutions
> http://www.ics-il.com
> Midwest Internet Exchange
> http://www.midwest-ix.com

Re: [pmacct-discussion] pmacct performance

2017-11-21 Thread Anthony Caiafa
Yep, so it looks like every time kafka_history runs, no matter what
interval you put it on it will crash pmacct and restart the service.
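
For context, a rough illustration of what a 5-minute history setting implies. It assumes, as I read the pmacct documentation, that `kafka_history` assigns each flow to a fixed-width time bin while `kafka_refresh_time` sets how often the cache is written out to Kafka. The function name is hypothetical, and this shows the general bucketing technique rather than pmacct's own code.

```python
def history_bucket(timestamp, period_seconds=300):
    """Floor a Unix timestamp (seconds) to the start of its time bin."""
    return timestamp - (timestamp % period_seconds)


if __name__ == "__main__":
    # Two flows 40 seconds apart land in the same 5-minute bin.
    print(history_bucket(1521676810))
    print(history_bucket(1521676850))
```

With both settings at 300 seconds, every flush would line up with a bin boundary, which is at least consistent with the restarts happening every 5 minutes on the dot.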

On Sat, Nov 18, 2017 at 9:27 AM, Anthony Caiafa <2600...@gmail.com> wrote:
> Sounds good. I’ll be sending out some data to you.
>
> On Sat, Nov 18, 2017 at 9:25 AM Paolo Lucente <pa...@pmacct.net> wrote:
>>
>>
>> Hi Anthony,
>>
>> Keep me posted on the ordering part. Wrt the complete drop in the
>> service, as you described in your original email, i have little info to
>> comment: let's say it should never happen but i don't know to this point
>> if it's a crash or a graceful shutdown with some message in the logs. If
>> you wish, we can take this further and you could start from this section
>> of doc about suspect of crashes:
>>
>> https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1994-L2013
>>
>> Any output from gdb and such, you can freely take it off list and
>> unicast to me directly. We can then summarise things back on list.
>>
>> Paolo
>>
>> On Fri, Nov 17, 2017 at 10:41:40AM -0500, Anthony Caiafa wrote:
>> > Hi!..  so i have the load spread between 3 machines and 2 ports per
>> > box. the biggest thing in the netflow data is the ordering for me. I
>> > guess where i am still curious is would either of those settings be
>> > causing the complete drop in the service where it starts and stops
>> > every 5 minutes on the dot? I am going to play around with the times
>> > on it to see if it is one of those settings. I will eventually have to
>> > increase this to about 2-4m flows per second so maybe the replicator
>> > is the best way forward.
>> >
>> > On Fri, Nov 17, 2017 at 9:47 AM, Paolo Lucente <pa...@pmacct.net> wrote:
>> > >
>> > > Hi Anthony,
>> > >
>> > > I map the word 'message' to 'flow' and not to NetFlow packet, please
>> > > correct me if this assumption is wrong. 55m flows/min makes it roughly
>> > > 1m flows/sec. I would not recommend stretching a single nfacctd daemon
>> > > beyond 200K flows/sec and the beauty of NetFlow, being UDP, is
>> > > that it can be easily scaled horizontally. For a start, details and
>> > > complexity may vary from use-case to use-case, I would hence recommend
>> > > to look in the following direction: point all NetFlow to a single IP/
>> > > port where a nfacctd in replicator mode is listening. You should test
>> > > it being able to absorb the full feed on your CPU resources. Then you
>> > > replicate to nfacctd collectors downstream parts of the full feed, ie.
>> > > you can instantiate with some headroom around 6-8 nfacctd collectors.
>> > > You can balance the incoming NetFlow packets using round-robin or
>> > > assigning flow exporters to flow collectors or with some hashing. Here
>> > > is how to start with it:
>> > >
>> > > https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1384-L1445
>> > >
>> > > Of course you can do the same with your load-balancer of preference.
>> > >
>> > > Paolo
>> > >
>> > > On Thu, Nov 16, 2017 at 01:16:48PM -0500, Anthony Caiafa wrote:
>> > >> Hi! So my use case may be slightly larger than most. I am processing 1:1
>> > >> netflow data for a larger infrastructure. We are receiving about 55 million
>> > >> messages a minute, which isn’t much, but pmacct does not seem to like
>> > >> it so much. I have pmacct scheduled with nomad running across a few
>> > >> machines and 2 designated ports accepting the flow traffic and outputting
>> > >> it to kafka.
>> > >>
>> > >> About every 5m or so pmacct dies and restarts basically dropping all
>> > >> traffic for a short period of time. The two configurations i have that are
>> > >> doing anything every 5 minutes are:
>> > >>
>> > >> kafka_refresh_time[name]: 300
>> > >> kafka_history[name]: 5m
>> > >>
>> > >>
>> > >> So i am not sure if it's one of these or not, since the logs only indicate
>> > >> that it lost a connection to kafka and that's about it.

Re: [pmacct-discussion] pmacct performance

2017-11-18 Thread Anthony Caiafa
Sounds good. I’ll be sending out some data to you.

On Sat, Nov 18, 2017 at 9:25 AM Paolo Lucente <pa...@pmacct.net> wrote:

>
> Hi Anthony,
>
> Keep me posted on the ordering part. Wrt the complete drop in the
> service, as you described in your original email, i have little info to
> comment: let's say it should never happen but i don't know to this point
> if it's a crash or a graceful shutdown with some message in the logs. If
> you wish, we can take this further and you could start from this section
> of doc about suspect of crashes:
>
> https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1994-L2013
>
> Any output from gdb and such, you can freely take it off list and
> unicast to me directly. We can then summarise things back on list.
>
> Paolo
>
> On Fri, Nov 17, 2017 at 10:41:40AM -0500, Anthony Caiafa wrote:
> > Hi!..  so i have the load spread between 3 machines and 2 ports per
> > box. the biggest thing in the netflow data is the ordering for me. I
> > guess where i am still curious is would either of those settings be
> > causing the complete drop in the service where it starts and stops
> > every 5 minutes on the dot? I am going to play around with the times
> > on it to see if it is one of those settings. I will eventually have to
> > increase this to about 2-4m flows per second so maybe the replicator
> > is the best way forward.
> >
> > On Fri, Nov 17, 2017 at 9:47 AM, Paolo Lucente <pa...@pmacct.net> wrote:
> > >
> > > Hi Anthony,
> > >
> > > I map the word 'message' to 'flow' and not to NetFlow packet, please
> > > correct me if this assumption is wrong. 55m flows/min makes it roughly
> > > 1m flows/sec. I would not recommend stretching a single nfacctd daemon
> > > beyond 200K flows/sec and the beauty of NetFlow, being UDP, is
> > > that it can be easily scaled horizontally. For a start, details and
> > > complexity may vary from use-case to use-case, I would hence recommend
> > > to look in the following direction: point all NetFlow to a single IP/
> > > port where a nfacctd in replicator mode is listening. You should test
> > > it being able to absorb the full feed on your CPU resources. Then you
> > > replicate to nfacctd collectors downstream parts of the full feed, ie.
> > > you can instantiate with some headroom around 6-8 nfacctd collectors.
> > > You can balance the incoming NetFlow packets using round-robin or
> > > assigning flow exporters to flow collectors or with some hashing. Here
> > > is how to start with it:
> > >
> > > https://github.com/pmacct/pmacct/blob/master/QUICKSTART#L1384-L1445
> > >
> > > Of course you can do the same with your load-balancer of preference.
> > >
> > > Paolo
> > >
> > > On Thu, Nov 16, 2017 at 01:16:48PM -0500, Anthony Caiafa wrote:
> > >> Hi! So my use case may be slightly larger than most. I am processing 1:1
> > >> netflow data for a larger infrastructure. We are receiving about 55 million
> > >> messages a minute, which isn’t much, but pmacct does not seem to like
> > >> it so much. I have pmacct scheduled with nomad running across a few
> > >> machines and 2 designated ports accepting the flow traffic and outputting
> > >> it to kafka.
> > >>
> > >> About every 5m or so pmacct dies and restarts basically dropping all
> > >> traffic for a short period of time. The two configurations i have that are
> > >> doing anything every 5 minutes are:
> > >>
> > >> kafka_refresh_time[name]: 300
> > >> kafka_history[name]: 5m
> > >>
> > >>
> > >> So i am not sure if it's one of these or not, since the logs only indicate
> > >> that it lost a connection to kafka and that's about it.
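
Of the balancing options Paolo lists in the quoted reply (round-robin, assigning exporters to collectors, or hashing), the hashing variant can be sketched as below. This only illustrates the idea of a stable exporter-to-collector mapping; it is not how the pmacct replicator implements it, and the function name and addresses are hypothetical.

```python
import zlib


def pick_collector(exporter_ip, collectors):
    """Map an exporter address to one collector, stable across calls."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    index = zlib.crc32(exporter_ip.encode("ascii")) % len(collectors)
    return collectors[index]


if __name__ == "__main__":
    # Hypothetical collector endpoints (documentation addresses).
    pool = ["192.0.2.11:2100", "192.0.2.12:2100", "192.0.2.13:2100"]
    for exporter in ("198.51.100.1", "198.51.100.2", "198.51.100.3"):
        print(exporter, "->", pick_collector(exporter, pool))
```

Keeping all packets from one exporter on the same collector matters for NetFlow v9 and IPFIX, where the collector needs the templates sent by that exporter to decode its data records.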

[pmacct-discussion] pmacct performance

2017-11-16 Thread Anthony Caiafa
Hi! So my use case may be slightly larger than most. I am processing 1:1
netflow data for a larger infrastructure. We are receiving about 55 million
messages a minute, which isn’t much, but pmacct does not seem to like
it so much. I have pmacct scheduled with nomad running across a few
machines and 2 designated ports accepting the flow traffic and outputting
it to kafka.

About every 5m or so pmacct dies and restarts basically dropping all
traffic for a short period of time. The two configurations i have that are
doing anything every 5 minutes are:

kafka_refresh_time[name]: 300
kafka_history[name]: 5m


So i am not sure if it's one of these or not, since the logs only indicate
that it lost a connection to kafka and that's about it.
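
A back-of-the-envelope check of the load described above, assuming one message is one flow. The 200,000 flows per second per daemon is the ceiling Paolo suggests in his reply in this thread; the headroom factor of 1.5 is a hypothetical choice, picked only to show how the estimate moves.

```python
import math


def flows_per_second(messages_per_minute):
    """Convert a per-minute message rate to a per-second rate."""
    return messages_per_minute / 60


def collectors_needed(messages_per_minute, per_daemon_flows_per_sec, headroom=1.0):
    """Smallest number of daemons that keeps each one under its ceiling."""
    rate = flows_per_second(messages_per_minute) * headroom
    return math.ceil(rate / per_daemon_flows_per_sec)


if __name__ == "__main__":
    print(round(flows_per_second(55_000_000)))                      # 916667
    print(collectors_needed(55_000_000, 200_000))                   # 5
    print(collectors_needed(55_000_000, 200_000, headroom=1.5))     # 7
```

So 55 million messages a minute is a little under 1 million flows per second: about five daemons at the suggested ceiling, or around seven with 50% headroom.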