I am constructing a collector that will receive a significant volume of non-sampled flows. The primary goal is to anonymize the IP addresses in the flows, but preserve a one-to-one relationship for each /32 address and then deliver these anonymized flows to a research collector. Yes, I realize this results in something that is not completely anonymous.
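
To make the one-to-one requirement concrete: one way to get it is a keyed permutation over the 32-bit IPv4 address space, so that distinct /32 addresses can never collide. The following is only a minimal sketch in Python, assuming a small Feistel network with HMAC-SHA256 as the round function. The function names, the round count and the key handling are illustrative assumptions, not part of flow-tools, and this is not a vetted cipher. Note that it does not preserve prefixes, and anyone holding the key can reverse the mapping.

```python
import hashlib
import hmac
import ipaddress

ROUNDS = 4  # assumption: illustrative round count, not a vetted security parameter


def _round(key: bytes, rnd: int, half: int) -> int:
    """Keyed round function: map a 16-bit half to a 16-bit value."""
    msg = bytes([rnd]) + half.to_bytes(2, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "big")


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Map an IPv4 address to another IPv4 address, one-to-one for a fixed key."""
    value = int(ipaddress.IPv4Address(addr))
    left, right = value >> 16, value & 0xFFFF
    for rnd in range(ROUNDS):
        # Feistel step: invertible regardless of what _round returns.
        left, right = right, left ^ _round(key, rnd, right)
    return str(ipaddress.IPv4Address((left << 16) | right))


def deanonymize_ipv4(addr: str, key: bytes) -> str:
    """Inverse of anonymize_ipv4; its existence is what makes the mapping a bijection."""
    value = int(ipaddress.IPv4Address(addr))
    left, right = value >> 16, value & 0xFFFF
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, rnd, left), left
    return str(ipaddress.IPv4Address((left << 16) | right))
```

Because the mapping is computed rather than looked up, no per-address state table has to be kept in memory or on disk, which matters when the anonymizer sits in the middle of a pipe. The same key has to be reused for as long as the research data needs to stay consistent.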

Nevertheless, I'm pondering how best to perform this process so that the anonymized flows reach the research collector as quickly as possible with the hardware available. It seems there are two basic approaches. One is to pipe received flows to an anonymization process, which in turn sends its output to flow-send via another pipe. The other is to fully capture the flows, run the anonymization process on the disk-stored flows, and pipe its output to flow-send. The latter option would seem to incur a large I/O penalty, but I wonder if it offers an advantage in reliability for flow delivery to the research collector.

Note that non-anonymized flows will be captured to disk on the initial collector for non-research storage and analysis purposes anyway, so disk writes are already going to be done. Some periodic disk reads will be done using other utilities like those in the flow-tools package or FlowScan. I'd like to avoid having to add hardware for handling the anonymization or local storage/analysis processes. The collector receiving the initial flows is a recent Intel dual-processor box with 12 GB of RAM and a few hundred gigabytes of available disk.

I envision running a flow-fanout process that hands one copy of the flows to the flow-capture process for local, non-research purposes and hands another copy to flow-receive, which sends to an anonymization process, which in turn sends directly to flow-send for delivery to the research collector. That seems like the easiest and most scalable approach with what I have to work with.

Thoughts about this, or suggestions for a design that is as robust as can be?

John

_______________________________________________
Flow-tools mailing list
[EMAIL PROTECTED]
http://mailman.splintered.net/mailman/listinfo/flow-tools
