hi Richard,

> Next step would be integrating AI (machine learning) with SEC somehow, so
> that user won't need to configure correlations statically, but they would
> configure and self-optimize automatically. (There still could be some input
> needed from the user, but system would be also able to react on changing
> log traffic, and self-evolve.)
>
> Something like ELK+AI has usable in the log monitoring area.
>
> Maybe some integration with MXNet?
>
> http://blogs.perl.org/users/sergey_kolychev/2017/02/machine-learning-in-perl.html
>
> Does anybody have any experience in this area, to explain some more or
> less theoretical or practical setup of AI-generated SEC rules? (I am pretty
> sure, that this is out of scope of SEC itself, and SEC wouldn't know, that
> AI is dynamically generating its rules on the background and probably
> nobody has working solution, but maybe we could invent something together.)
>

Machine learning is a very wide area with a large number of different
methods and algorithms. These methods and algorithms are usually divided
into two large classes:

*) supervised algorithms, which assume that you provide labeled data for
learning (for example, a log file where some messages are labeled as
"normal" and some messages as "system_fault"), so that the algorithm can
learn from labeled examples how to distinguish normal messages from errors
(note that this simplified example uses only two labels, but more complex
cases could have more labels in play)

*) unsupervised algorithms, which are able to distinguish anomalous or
abnormal messages without any previous training with labeled data

So my first question is: what is your actual setup, and do you have the
opportunity to use training data for supervised methods, or are
unsupervised methods a better choice? After answering this question, you
can start studying the most promising methods more closely.

Secondly, what is your actual goal? Do you want to:

1) detect an individual anomalous message or a time frame containing
anomalous messages from event logs,
2) produce a warning if the number of messages from a specific class (e.g.
login failures) per N minutes increases suddenly to an unexpectedly large
value,
3) use some tool for (semi)automated mining of new SEC rules,
4) something else?

For achieving the first goal, there is no silver bullet, but perhaps I can
provide a few pointers to some relevant research papers (note that there
are many other papers in this area):
https://ieeexplore.ieee.org/document/4781208
https://ieeexplore.ieee.org/document/7367332
https://dl.acm.org/doi/10.1145/3133956.3134015

For achieving the second goal, you could consider using time series
analysis methods. You could begin with a very simple moving average based
method like the one described here:
https://machinelearnings.co/data-science-tricks-simple-anomaly-detection-for-metrics-with-a-weekly-pattern-2e236970d77
or you could employ more complex forecasting methods (before starting, it
is probably a good idea to read this book on forecasting:
https://otexts.com/fpp2/)

If you want to mine new rules or knowledge for SEC (or for other tools)
from event logs, I have actually done some previous research in this
domain. Perhaps I can point you to a log mining utility called LogCluster
(https://ristov.github.io/logcluster/) which allows for mining line
patterns and outliers from textual event logs. Also, a couple of years ago,
an experimental system was created which used LogCluster in a fully
automated way for creating SEC Suppress rules, where these rules
essentially matched normal (expected) messages. Any message not matching
these rules was considered an anomaly and was logged separately for manual
review.
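
To make the "suppress what is normal, review the rest" idea a bit more
concrete, here is a minimal Python sketch. It is only an illustration of
the principle and not the code of the experimental system: the two regular
expressions, the sample log lines and the function name split_anomalies are
all made up for this example, whereas in the real setup the patterns for
normal messages would come from a mining step such as LogCluster.

```python
import re

# Hypothetical patterns describing "normal" (expected) messages.
# These are hand-written assumptions for illustration only; in an
# automated setup they would be mined from past logs.
NORMAL_PATTERNS = [
    r"sshd\[\d+\]: Accepted publickey for \S+ from \S+ port \d+",
    r"CRON\[\d+\]: \(\S+\) CMD \(.+\)",
]


def split_anomalies(lines, patterns=NORMAL_PATTERNS):
    """Split log lines into (normal, anomalous).

    A line counts as normal if at least one pattern matches it,
    which mirrors what a set of suppressing rules would do.
    Everything else is kept for manual review.
    """
    compiled = [re.compile(p) for p in patterns]
    normal, anomalous = [], []
    for line in lines:
        if any(rx.search(line) for rx in compiled):
            normal.append(line)
        else:
            anomalous.append(line)
    return normal, anomalous


if __name__ == "__main__":
    sample = [
        "Jan  1 10:00:01 host sshd[1234]: Accepted publickey for alice "
        "from 10.0.0.5 port 50022 ssh2",
        "Jan  1 10:05:01 host CRON[2345]: (root) CMD (run-parts /etc/cron.hourly)",
        "Jan  1 10:07:13 host kernel: Out of memory: Kill process 4321 (java)",
    ]
    _, anomalies = split_anomalies(sample)
    for line in anomalies:
        print("ANOMALY:", line)
```

With the sample lines above, only the kernel message ends up in the
anomaly list. The quality of such a filter depends entirely on how well
the patterns cover normal traffic, so the pattern set would have to be
re-mined as the log traffic changes.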

Here is the paper that provides an overview of the LogCluster-based system:
https://ristov.github.io/publications/noms18-log-anomaly-web.pdf

Hopefully these pointers will offer you some guidance on what your precise
research question could be, and what the most promising avenue for
continuing is. My apologies if my answer raised new questions, but machine
learning is a very wide area with a large number of methods for many
different goals.

kind regards,
risto
_______________________________________________
Simple-evcorr-users mailing list
Simple-evcorr-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users