[ 
https://bro-tracker.atlassian.net/browse/BIT-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18215#comment-18215
 ] 

Jimmy Jones commented on BIT-1257:
----------------------------------

Sorry, I've not been as clear as I could have been here. I've changed my own 
Bro instance, but I'm concerned that, out of the box, Bro's behaviour, while 
convenient for the majority of cases, isn't correct and will result in 
irrecoverably corrupted files in some instances (unless you're lucky enough to 
keep full captures).

I've researched this further, and I would argue there is a right answer and the 
spec is clear; see RFC 2616, section 10.2.7:

bq. A cache MUST NOT combine a 206 response with other previously cached 
content if the ETag or Last-Modified headers do not match exactly, see 13.5.4.

I'd say Bro is acting as a cache in this instance. Clients such as IE follow 
this 
[behavior|http://blogs.msdn.com/b/ieinternals/archive/2011/06/03/send-an-etag-to-enable-http-206-file-download-resume-without-restarting.aspx],
 and Adobe Reader uses the If-Range conditional to ensure the URL still refers 
to the same document.
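
To make the rule concrete, here is a minimal sketch of the check I have in 
mind. It is a conservative reading of 10.2.7 and 13.5.4, not Bro's actual 
code: the function name and the lower-cased header dictionaries are my own 
assumptions, and I treat weak ETags and missing validators as "cannot combine".

```python
def validators_match(cached, partial):
    """Return True only if a 206 response may be combined with cached content.

    `cached` and `partial` are hypothetical dicts of lower-cased response
    headers; this is an illustration, not Bro's real data model.
    """
    etag_a, etag_b = cached.get("etag"), partial.get("etag")
    lm_a, lm_b = cached.get("last-modified"), partial.get("last-modified")

    # Any mismatch (including a header present on one side only) blocks combining.
    if etag_a != etag_b or lm_a != lm_b:
        return False

    # A matching ETag only counts if it is a strong one (no "W/" prefix).
    if etag_a is not None:
        return not etag_a.startswith("W/")

    # Otherwise fall back to Last-Modified; with no validator at all we
    # cannot show that both responses are the same entity.
    return lm_a is not None
```

With this reading, two responses that carry no ETag and no Last-Modified are 
never combined, which is the situation in the attached sample.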

I agree my change is over-conservative. Would you accept something that 
includes ETag and Last-Modified in the hash? Or is the (small) chance of 
corruption not a concern? (That's fine, as long as someone has actively decided 
not to follow the RFC.)
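
As a rough illustration of the proposal, the sketch below folds the two 
validators into the hash input. Everything here is hypothetical: the function 
name, the choice of client and URL as the existing inputs, and SHA-1 are 
assumptions for the example, not a description of how Bro derives file IDs.

```python
import hashlib


def file_id_key(client, url, etag=None, last_modified=None):
    """Derive a hypothetical file ID from the request plus response validators."""
    digest = hashlib.sha1()
    for part in (client, url, etag or "", last_modified or ""):
        data = part.encode("utf-8")
        # Length-prefix each field so that ("ab", "c") and ("a", "bc")
        # cannot produce the same byte stream.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


# Same URL and client, different ETag: the IDs differ.
id_v1 = file_id_key("10.0.0.1", "http://example.com/f.pdf", etag='"v1"')
id_v2 = file_id_key("10.0.0.1", "http://example.com/f.pdf", etag='"v2"')
```

Note that two downloads with no validators at all would still share an ID 
under this scheme, so it reduces the chance of corruption rather than removing 
it.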


> Same file id generated for potentially different files
> ------------------------------------------------------
>
>                 Key: BIT-1257
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1257
>             Project: Bro Issue Tracker
>          Issue Type: Problem
>          Components: Bro
>    Affects Versions: git/master, 2.3
>         Environment: CentOS 6
>            Reporter: Jimmy Jones
>         Attachments: fa.bro, sample-samefileid.pcap
>
>
> The attached sample contains two HTTP downloads of the same URL from the same 
> client, but there are no guarantees that the file is actually the same (no 
> ETags etc. - in this case it actually is the same, but let's pretend they were 
> different...). However, the file analysis framework seems to give the same 
> file ID in file_name and file_chunk for both downloads.
> I think this has something to do with Range requests, as it doesn't happen if 
> you do "normal" HTTP requests.



--
This message was sent by Atlassian JIRA
(v6.4-OD-05-009#64003)

_______________________________________________
bro-dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
