Stefano,

>> An overwhelming majority of network based IDSs use only spatial
>> information present in packet headers.
>
> "spatial" information ? if you mean "IP addresses", then

I took "spatial" information to mean connection or packet header data -- more than just IP addresses, but lacking the unstructured data portions.

> 1) your statement is definitely not true and

Actually, I think it is: the majority of unique NIDSs that I am familiar with were built to use the KDD Cup '99 dataset. I pray none of those systems are actually used in production anywhere.

Let's face it, only a handful of signature based network intrusion detectors were ever built. After Marty released Snort to the community, there really hasn't been a need to build another. Sure, a couple have been built so that they wouldn't be "encumbered" by the open source license, but there really haven't been any major changes to signature based detection in the past decade (just thousands of tweaks). Most anomaly or machine learning based detectors will only work with structured data, so they limit themselves to the header portions of the packets or connection records.

> 2) such IDSs "work" only because of the artifacts in the evaluation datasets

We can't really say that conclusively. At this point we can only say that any successes demonstrated by those systems have been due to flaws in the evaluation datasets. For lack of good evaluation datasets, we have no idea how those systems might perform in real world environments. More importantly, for any system which requires training data, we must question how portable it is across different networks; should it require unique training data for a given network, is it feasible that such training data will ever be available?

I see a lot of people saying (correctly) that advanced (non-signature based) NIDS can't be researched until we have good evaluation datasets, and I see a lot of people ignoring them and doing it anyway. Is anyone (else) actually working on fixing the data problem?

Cheers,
Terry
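[For readers outside the field, a minimal sketch of the kind of "structured data only" anomaly detector described above: it sees nothing but numeric fields taken from KDD Cup '99-style connection records, never the payload, and it needs benign training data from the specific network being monitored -- which is exactly the portability problem raised in the email. The feature list, class name, and z-score threshold here are illustrative assumptions, not anyone's actual system.]

```python
# Hypothetical header-only anomaly detector sketch. Feature names follow the
# KDD Cup '99 connection-record style; the threshold is an arbitrary choice.
import numpy as np

FEATURES = ["duration", "src_bytes", "dst_bytes", "count", "srv_count"]


class HeaderOnlyAnomalyDetector:
    """Flags connections whose header-derived features deviate from a
    per-network baseline learned from (assumed benign) training records."""

    def fit(self, records):
        # records: iterable of dicts mapping feature name -> numeric value
        X = np.array([[r[f] for f in FEATURES] for r in records], dtype=float)
        self.mean = X.mean(axis=0)
        self.std = X.std(axis=0) + 1e-9  # avoid division by zero
        return self

    def score(self, record):
        # Largest absolute z-score across the header features.
        x = np.array([record[f] for f in FEATURES], dtype=float)
        return float(np.max(np.abs((x - self.mean) / self.std)))

    def is_anomalous(self, record, threshold=4.0):
        return self.score(record) > threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "normal" connection records standing in for training data.
    baseline = [
        {"duration": rng.integers(0, 5), "src_bytes": rng.integers(100, 300),
         "dst_bytes": rng.integers(2000, 6000), "count": rng.integers(1, 10),
         "srv_count": rng.integers(1, 10)}
        for _ in range(100)
    ]
    det = HeaderOnlyAnomalyDetector().fit(baseline)

    # A scan-like record: many connections to the same service, no payload.
    probe = {"duration": 0, "src_bytes": 0, "dst_bytes": 0,
             "count": 500, "srv_count": 500}
    print(det.score(probe), det.is_anomalous(probe))
```

Note that the baseline here is synthetic; in practice such a detector stands or falls on whether representative, attack-free training data for the target network exists at all, which is the point of the thread.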
