[ 
https://bro-tracker.atlassian.net/browse/BIT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010#comment-16010
 ] 

Aashish Sharma commented on BIT-1140:
-------------------------------------

Matthias, 

I have created two simple test files. Both of them add a set of URLs to a 
bloomfilter. 

The scripts then call bloomfilter_lookup on a *different* set of URLs. 

You should notice two problems:
1) URLs which were never added to the filter show up as being in the filter 
(bloomfilter_lookup returns 1). 
2) The result is inconsistent across runs: for the same URL, 
bloomfilter_lookup sometimes returns 0 and sometimes 1. 
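
For comparison, here is a minimal Python sketch of the behaviour one would 
normally expect from a Bloom filter: false positives can occur, but with fixed 
hash functions the same lookup gives the same answer on every run. This is 
only an illustration and not Bro's implementation; the BloomFilter class, the 
filter size, the SHA-256 based hashing and the example URLs are all 
assumptions made up for the sketch. 

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter (hypothetical, not Bro's code)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit position, to keep the sketch easy to read.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive the positions from an unsalted digest, so they are
        # identical on every run for the same input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def lookup(self, item):
        # 1 = possibly present (may be a false positive), 0 = definitely absent.
        return 1 if all(self.bits[pos] for pos in self._positions(item)) else 0


def run_once():
    """Add one set of URLs, then look up a disjoint set."""
    bf = BloomFilter(num_bits=4096, num_hashes=7)
    for i in range(100):
        bf.add(f"http://mail.example.com/link/{i}")
    lookups = [f"http://web.example.org/page/{i}" for i in range(20)]
    return [bf.lookup(url) for url in lookups]


if __name__ == "__main__":
    # Repeated runs should print the same list of 0/1 results.
    print(run_once())
```

In this sketch an occasional 1 for a URL that was never added would be a 
normal false positive, but a result that changes between runs would not be 
expected. 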

The URLs added are those extracted from SMTP traffic, while the URLs looked up 
come from the HTTP stream. Basically, I am building a bloomfilter of all the 
URLs extracted from emails and then testing HTTP requests against it to see if 
any of the SMTP URLs "has been clicked". (Currently I use a table, which gives 
me correct results but with a much bigger memory footprint.)
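
To give a rough idea of the memory side, the textbook Bloom filter sizing 
formulas can be evaluated as below. This is a sketch only: the bloom_size 
helper is hypothetical, and the 10,000 URLs and the 0.001 false-positive rate 
are assumed numbers, not values taken from my scripts. 

```python
import math


def bloom_size(expected_items, fp_rate):
    """Return (bits, hash_count) from the standard sizing formulas.

    bits   = -n * ln(p) / (ln 2)^2
    hashes = (bits / n) * ln 2
    """
    bits = math.ceil(-expected_items * math.log(fp_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / expected_items * math.log(2)))
    return bits, hashes


if __name__ == "__main__":
    # Assumed workload: 10,000 URLs at a target false-positive rate of 0.1%.
    bits, hashes = bloom_size(10_000, 0.001)
    print(bits // 8, "bytes,", hashes, "hash functions")
```

Under those assumptions the filter needs about 14 bits per URL regardless of 
URL length, whereas a table has to keep every URL string itself. 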

With the bloomfilter, we see quite a lot of false positives. 

Here are two examples: 

1) bloom-test-short.bro - looks up only 4 URLs. On repeated runs (bro 
./bloom-test-short.bro) you should see different lookup results (0 = miss, 
1 = hit), even though the URLs we are looking up were never added to the 
filter. 
2) bloom-test2.bro - has a much more extensive lookup set. On each run you 
should see lookup results of 0 or 1, and they vary between runs. Again, all 
the lookup URLs are different from the ones added. 

Please let me know if you have problems reproducing this. I can send you the 
actual smtp-embedded-url.bro scripts as well. 




> Bloomfilter hashing problem
> ---------------------------
>
>                 Key: BIT-1140
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1140
>             Project: Bro Issue Tracker
>          Issue Type: Problem
>          Components: Bro
>            Reporter: Robin Sommer
>            Assignee: Matthias Vallentin
>             Fix For: 2.3
>
>         Attachments: bloom-test2.bro, bloom-test-short.bro
>
>
> It seems bloomfilter hashing isn't working correctly. Has that been 
> confirmed? Is there a fix?



--
This message was sent by Atlassian JIRA
(v6.3-OD-01-067#6307)
_______________________________________________
bro-dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
