faster-val-clone

Jon Siwek (JIRA) Tue, 18 Mar 2014 15:13:08 -0700

    [ 
https://bro-tracker.atlassian.net/browse/BIT-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15808#comment-15808
 ]


Jon Siwek commented on BIT-1161:
--------------------------------

{quote}
The obvious overhead that could be reduced was from the fixed growth 
incrementation of the buffer used to contain serialized data. With records that 
expand out to ~1.6M (master) or ~3M (topic/bernhard/file-analysis-x509) in 
serialized form, it takes a bit too many allocations when trying to get there 
in growth increments of 64K. It may also help some to use realloc instead of 
new/memcpy/delete each time it needs to grow.
{quote}

Note that the benefit of this optimization is more pronounced on Bernhard's 
branch.  I don't think the serialized data being roughly twice as large there 
is necessarily wrong or needs to be fixed/changed, but it might be worth 
double-checking whether some of the redefs of SSL::Info or Files::Info can be 
streamlined.
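
To make the allocation-count argument concrete, here is a minimal sketch that 
compares growing a buffer in fixed 64K steps against doubling its capacity, 
using realloc in both cases.  GrowBuf, Append and CountReallocs are 
hypothetical names for illustration and are not Bro's serializer code; the 
sketch also assumes the data arrives in 1K chunks and that "~1.6M" and "~3M" 
mean 1600 KiB and 3072 KiB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Growth step and initial capacity, matching the 64K figure in the message.
constexpr std::size_t kStep = 64 * 1024;

// Hypothetical growable byte buffer (illustration only, not Bro's class).
struct GrowBuf {
    char* data = nullptr;
    std::size_t len = 0;       // bytes in use
    std::size_t cap = 0;       // bytes allocated
    std::size_t reallocs = 0;  // number of realloc calls made so far
    bool geometric;            // true: double capacity; false: add 64K

    explicit GrowBuf(bool geometric_growth) : geometric(geometric_growth) {}
    ~GrowBuf() { std::free(data); }
    GrowBuf(const GrowBuf&) = delete;
    GrowBuf& operator=(const GrowBuf&) = delete;

    bool Append(const char* src, std::size_t n) {
        assert(src != nullptr || n == 0);
        if (len + n > cap) {
            std::size_t new_cap = cap ? cap : kStep;
            while (new_cap < len + n)
                new_cap = geometric ? new_cap * 2 : new_cap + kStep;
            // realloc may extend the block in place, avoiding the copy that
            // new/memcpy/delete always performs.
            char* p = static_cast<char*>(std::realloc(data, new_cap));
            if (!p)
                return false;  // old block is still valid and freed later
            data = p;
            cap = new_cap;
            ++reallocs;
        }
        std::memcpy(data + len, src, n);
        len += n;
        return true;
    }
};

// Appends total_bytes in 1K chunks and reports how many reallocs it took.
// total_bytes is assumed to be a multiple of 1024.
std::size_t CountReallocs(std::size_t total_bytes, bool geometric) {
    GrowBuf buf(geometric);
    char chunk[1024] = {0};
    for (std::size_t written = 0; written < total_bytes;
         written += sizeof(chunk))
        if (!buf.Append(chunk, sizeof(chunk)))
            return 0;
    return buf.reallocs;
}
```

Under those assumptions, reaching 1600 KiB takes 25 allocations with fixed 
64K steps versus 6 with doubling, and 3072 KiB takes 48 versus 7, which is 
consistent with the benefit being larger on the branch with the bigger 
records.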

> topic/jsiwek/faster-val-clone
> -----------------------------
>
>                 Key: BIT-1161
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1161
>             Project: Bro Issue Tracker
>          Issue Type: Improvement
>          Components: Bro
>    Affects Versions: git/master
>            Reporter: Jon Siwek
>             Fix For: 2.3
>
>
> This branch makes it less expensive to serialize large/complex values (e.g. 
> connection and/or fa_file records).
>
> The obvious overhead that could be reduced was from the fixed growth 
> incrementation of the buffer used to contain serialized data.  With records 
> that expand out to ~1.6M (master) or ~3M (topic/bernhard/file-analysis-x509) 
> in serialized form, it takes a bit too many allocations when trying to get 
> there in growth increments of 64K.  It may also help some to use realloc 
> instead of new/memcpy/delete each time it needs to grow.
>
> I didn't find it helped much to increase the initial buffer size from 64K 
> (and 90% of the things needing serialization fit in that size buffer anyway).
>
> It could possibly help to preallocate a buffer that gets re-used across 
> serializations instead of repeatedly allocating small buffers that will need 
> to be resized.
>
> I don't have a complete breakdown/view of the bytes that make up the 
> serialized version of the large/complex records, but taking a quick look I 
> note that the filenames from Location information of each BroObj/Val make up 
> a third of ~1.6M (master).  And that's the full path of each file, so this 
> all will depend on where the Bro scripts reside on the file system (i.e. put 
> them as close to the root dir as possible and you might increase 
> performance!).
>
> Any other quick ideas of what can be done here?  If not, improving the 
> serialization seems to deserve its own project (which also might be part of 
> the new comm. library project) for later.
>
> In the meantime, it's at least shown that avoiding situations where 
> large/complex records are serialized can help (BIT-1139).  And that might 
> always be a useful optimization strategy if the serialized representation of 
> Vals is going to scale not just as a function of their value, but also w/ 
> their type/attribute/location information.
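
The buffer re-use idea from the description could look roughly like the 
sketch below.  It is only an illustration: SerializeInto and CountGrowths are 
hypothetical helpers, the "serializer" just copies bytes, and std::vector 
stands in for whatever buffer type the real code uses.  It relies on 
std::vector::clear() keeping the allocated capacity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a serializer: copies payload bytes into out.
// Returns true if out had to grow its allocation to hold the payload.
bool SerializeInto(std::vector<char>& out, const std::string& payload) {
    const std::size_t cap_before = out.capacity();
    out.clear();  // drops the contents but keeps the allocated capacity
    out.insert(out.end(), payload.begin(), payload.end());
    return out.capacity() != cap_before;
}

// Counts how many serializations had to grow their buffer, either with one
// buffer shared across all calls (reuse) or a fresh 64K buffer per call.
std::size_t CountGrowths(const std::vector<std::string>& payloads,
                         bool reuse) {
    std::vector<char> shared;
    shared.reserve(64 * 1024);  // same 64K starting size as in the message
    std::size_t growths = 0;
    for (const auto& p : payloads) {
        if (reuse) {
            growths += SerializeInto(shared, p) ? 1 : 0;
        } else {
            std::vector<char> fresh;
            fresh.reserve(64 * 1024);
            growths += SerializeInto(fresh, p) ? 1 : 0;
        }
    }
    return growths;
}
```

With a shared buffer, only the first oversized value pays for a resize; 
later values of similar size fit in the capacity that is already there.  The 
per-call variant also pays for its initial 64K allocation every time, which 
the counter above does not include.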



--
This message was sent by Atlassian JIRA
(v6.2-OD-10-004-WN#6253)
_______________________________________________
bro-dev mailing list
[email protected]
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev

