Hi all,

I am attaching the document that describes how Spot uses LDA to perform
anomaly detection on network events. I have also received multiple questions
about how the ‘user scoring’ (‘feedback’) of particular items in the suspicious
connects report (in the UI layer) is used in ML. The attached document does not
provide much detail on this functionality, so I am putting an explanation out
here. We can then discuss any questions about it and decide what additional
info should be included in the attached document.

The Spot team feels that changes are needed to this ‘feedback’ functionality,
and sees these changes as happening concurrently with improvements to our
ability to carry context from an LDA model trained on a given batch of data
forward to the next training run (or even to training in a streaming use case).
The value of ‘feedback’ depends on the quality of the model context we can
carry over.

The idea for feedback is as follows. Items that are scored with a 1 (i.e. the
user identifies the item as benign and does not want to see it in the
suspicious connects report anymore) tell the machine learning component that
such an entry should no longer be considered suspicious. Currently this is done
by injecting artificial log entries into the next batch of data, so that LDA
sees many such entries and therefore no longer treats them as anomalies.
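
To make the injection idea concrete, here is a minimal sketch. It is not
Spot's actual code: the record fields (src_ip, dst_ip), the function name
inject_feedback and the copies parameter are all hypothetical, and the right
number of copies would depend on the batch size and the model.

```python
from typing import Dict, List

Record = Dict[str, str]  # hypothetical flat log record: field name -> value


def inject_feedback(
    batch: List[Record], benign: List[Record], copies: int = 1000
) -> List[Record]:
    """Return a new batch with `copies` duplicates of each benign record appended.

    The duplicates make the benign pattern frequent in the training data, so a
    frequency-based model such as LDA should stop ranking it as rare.
    """
    # Copy each record so later edits to the output do not alter the feedback list.
    injected = [dict(rec) for rec in benign for _ in range(copies)]
    # Build a new list; the caller's batch is left untouched.
    return list(batch) + injected


if __name__ == "__main__":
    batch = [{"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2"}]
    benign = [{"src_ip": "10.0.0.5", "dst_ip": "8.8.8.8"}]
    print(len(inject_feedback(batch, benign, copies=3)))
```

The original batch is left unmodified, so the same feedback could be replayed
against several batches.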

We have ideas for other ways to provide this functionality. For example, we
could filter entries matching the identified pattern out of the next batch
BEFORE ML runs on it. For items that the user scores as ‘3’ in the UI (for
example, the user considers an IP so suspicious that they want to see all
future log entries associated with that IP), we could filter future items
matching such a pattern so that they skip ML and are instead reported in a
separate pane of the UI or inserted at the top of the most suspicious events.
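
The filtering alternative could look roughly like the sketch below. Again,
this is only an illustration under assumptions: patterns are modeled as
hypothetical field/value dictionaries, the names matches and route_batch are
made up, and letting a ‘3’ pattern take priority over a ‘1’ pattern is a
design choice of the sketch, not something we have decided.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]   # hypothetical flat log record: field name -> value
Pattern = Dict[str, str]  # hypothetical pattern: field name -> required value


def matches(record: Record, pattern: Pattern) -> bool:
    """True if the record has every field/value pair listed in the pattern."""
    return all(record.get(field) == value for field, value in pattern.items())


def route_batch(
    batch: List[Record],
    benign_patterns: List[Pattern],
    threat_patterns: List[Pattern],
) -> Tuple[List[Record], List[Record]]:
    """Split a batch into (records for ML, records flagged for direct reporting).

    Records matching a threat pattern (scored 3) skip ML and are flagged.
    Records matching a benign pattern (scored 1) are dropped before ML.
    Everything else goes to ML as usual.
    """
    for_ml: List[Record] = []
    flagged: List[Record] = []
    for rec in batch:
        if any(matches(rec, p) for p in threat_patterns):
            flagged.append(rec)  # assumption: threat feedback wins over benign
        elif any(matches(rec, p) for p in benign_patterns):
            continue  # user marked this pattern benign; keep it out of ML
        else:
            for_ml.append(rec)
    return for_ml, flagged
```

The flagged list is what would feed the separate UI pane or the top of the
suspicious events list; only the first list would be handed to ML.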

Comments, Questions?
Brandon
