I am constructing a collector that will receive a significant volume of
non-sampled flows.  The primary goal is to anonymize the IP addresses in
the flows, but preserve a one-to-one relationship for each /32 address
and then deliver these anonymized flows to a research collector.  Yes,
I realize this results in something that is not completely anonymous.
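
For illustration, here is a minimal sketch of one way to get a keyed,
one-to-one mapping of /32 addresses.  It is not taken from flow-tools.
The function names, the round count and the choice of a small Feistel
network with HMAC-SHA256 as the round function are all assumptions made
for this example.  A Feistel construction is used because it is a
permutation of the 32-bit address space, so two different addresses can
never map to the same output, whereas a truncated hash could collide.
It does not preserve prefix relationships and has not been reviewed as
a cipher.

```python
import hashlib
import hmac
import ipaddress

ROUNDS = 4  # assumed round count for this sketch, not a vetted choice


def _round(key: bytes, rnd: int, half: int) -> int:
    """Keyed round function: 16-bit input half -> 16-bit output."""
    msg = bytes([rnd]) + half.to_bytes(2, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Map an IPv4 address to another IPv4 address, one-to-one per key."""
    value = int(ipaddress.IPv4Address(addr))
    left, right = value >> 16, value & 0xFFFF
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round(key, rnd, right)
    return str(ipaddress.IPv4Address((left << 16) | right))


def deanonymize_ipv4(addr: str, key: bytes) -> str:
    """Invert anonymize_ipv4 for the same key (runs the rounds backwards)."""
    value = int(ipaddress.IPv4Address(addr))
    left, right = value >> 16, value & 0xFFFF
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, rnd, left), left
    return str(ipaddress.IPv4Address((left << 16) | right))


if __name__ == "__main__":
    secret = b"example key, replace with a real secret"
    hidden = anonymize_ipv4("192.0.2.17", secret)
    print(hidden, deanonymize_ipv4(hidden, secret))
```

Anyone holding the key can reverse the mapping, so the key would have
to stay on the initial collector.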

Nevertheless, I'm pondering how best to perform this process in order
to turn around the anonymized flows to the research collector as
quickly as possible with the hardware available.  It seems there are
two basic approaches.  One is to pipe received flows to an
anonymization process, which in turn sends its output to flow-send via
another pipe.

The alternative is to capture the flows to disk first, then run the
anonymization process on the stored flows and pipe its output to
flow-send.

The latter option would seem to result in a large I/O penalty, but I
wonder if it offers an advantage in reliability for flow delivery to the
research collector.

Note that non-anonymized flows will be captured to disk on the initial
collector for non-research storage and analysis purposes anyway, so
disk writes are already going to be done.  Some periodic disk reads
will be done using other utilities like those in the flow-tools package
or FlowScan.  I'd like to avoid having to add additional hardware for
handling the anonymization or local storage/analysis processes.  The
collector receiving the initial flows is a recent Intel dual processor
box with 12 GB of RAM and a few hundred gigabytes of available disk.

I envision running a flow-fanout process that hands one copy of the
flows to the flow-capture process for non-research purposes locally
and hands another copy to flow-receive, which sends to an anonymization
process, which in turn sends directly to flow-send for delivery to the
research collector.  That seems like the easiest and most scalable
approach with what I have to work with.
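
As a rough sketch of what the anonymization stage might do, the
following rewrites the source and destination addresses inside a single
export datagram.  It rests on assumptions not stated above: that the
exporter sends NetFlow version 5 (24-byte header, 48-byte records with
srcaddr and dstaddr as the first two fields), and that the stage sees
raw export datagrams rather than the flow-tools stream format that I
believe flow-receive writes to a pipe.  The function name and the
map_addr callable are hypothetical.  The nexthop field is left
untouched here.

```python
import struct

V5_HEADER_LEN = 24  # NetFlow v5 header size in bytes
V5_RECORD_LEN = 48  # NetFlow v5 flow record size in bytes


def anonymize_v5_datagram(datagram: bytes, map_addr) -> bytes:
    """Return a copy of a NetFlow v5 datagram with srcaddr/dstaddr remapped.

    map_addr is any callable taking and returning a 32-bit integer; it
    must be one-to-one to keep the per-address relationship.
    """
    if len(datagram) < V5_HEADER_LEN:
        raise ValueError("datagram shorter than a v5 header")
    version, count = struct.unpack_from("!HH", datagram, 0)
    if version != 5:
        raise ValueError("not a NetFlow v5 datagram")
    if len(datagram) < V5_HEADER_LEN + count * V5_RECORD_LEN:
        raise ValueError("datagram truncated")

    out = bytearray(datagram)
    for i in range(count):
        offset = V5_HEADER_LEN + i * V5_RECORD_LEN
        src, dst = struct.unpack_from("!II", out, offset)
        struct.pack_into("!II", out, offset, map_addr(src), map_addr(dst))
    return bytes(out)


if __name__ == "__main__":
    # Placeholder mapping for demonstration only: XOR is one-to-one but
    # offers no real protection.
    header = struct.pack("!HHIIIIBBH", 5, 1, 0, 0, 0, 0, 0, 0, 0)
    record = struct.pack("!II", 0x0A000001, 0xC0A80101) + bytes(40)
    result = anonymize_v5_datagram(header + record, lambda a: a ^ 0xFFFFFFFF)
    print(result[24:32].hex())
```

Because each datagram is handled independently and no state is kept
beyond the mapping key, a stage like this adds no disk I/O of its own.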

Thoughts about this or suggestions for a design that is as robust as
can be?

John
_______________________________________________
Flow-tools mailing list
[EMAIL PROTECTED]
http://mailman.splintered.net/mailman/listinfo/flow-tools
