Re: [Bro-Dev] Jira filter results

2018-08-28 Thread Jon Siwek
On Tue, Aug 28, 2018 at 12:48 PM Johanna Amann wrote:

> "The filter configured for this gadget could not be retrieved. Please
> verify it is still valid on the issue navigator.".

Should be showing merge requests again.

- Jon
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] Broker data layouts

2018-08-28 Thread Robin Sommer


On Tue, Aug 28, 2018 at 17:12 +0200, Dominik Charousset wrote:

> 1) Matthias threw in memory-mapping, but I’m not so sure if this is
> actually feasible for you.

Yeah, our normal use case is different, memory-mapping won't help much
with that.

> 2) CAF already does batching. Ideally, Broker should not need to do
> any additional batching on top of that.

Yep, but (3) was the problem with that:

> Do you still remember what showed up during your investigation that
> triggered you to go with the blob?

Looking back through emails, at some point Jon replaced CAF
serialization with these blobs and got substantially better
performance. He also had a patch that reproduced the effect with the
benchmark tool you wrote. I'm pasting that in below, I'm assuming it
still applies. Looks like the conclusion at that time was that it is
indeed an issue with the serialization and/or copying the data.

> An in-depth performance analysis of Broker’s streaming layer has been on my
> todo list for months at this point. I hope I get something done before
> the Bro Workshop in Europe.

That would be great. :)

Robin

```
diff --git a/tests/benchmark/broker-stream-benchmark.cc b/tests/benchmark/broker-stream-benchmark.cc
index 821ac39..26b0778 100644
--- a/tests/benchmark/broker-stream-benchmark.cc
+++ b/tests/benchmark/broker-stream-benchmark.cc
@@ -1,6 +1,7 @@
 #include 

 #include 
+#include 

 using std::cout;
 using std::cerr;
@@ -55,8 +56,11 @@ void publish_mode(broker::endpoint& ep, const std::string& topic_str) {
   // nop
 },
 [=](caf::unit_t&, downstream>& out, size_t num) {
-  for (size_t i = 0; i < num; ++i)
-out.push(std::make_pair(topic_str, "Lorem ipsum dolor sit amet."));
+  for (size_t i = 0; i < num; ++i) {
+auto ev = broker::bro::Event(std::string("event_1"),
+ std::vector{42, "test"});
+out.push(std::make_pair(topic_str, std::move(ev)));
+  }
   global_count += num;
 },
 [=](const caf::unit_t&) {
```

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


[Bro-Dev] Jira filter results

2018-08-28 Thread Johanna Amann
Hi,

when I go to tracker.bro.org, the top-right box (Filter result) for me
shows:

"The filter configured for this gadget could not be retrieved. Please
verify it is still valid on the issue navigator.". This seems to be
independent of the browser. I think this used to show the merge requests.

Can someone perhaps fix that again? :)

Thanks,
 Johanna


Re: [Bro-Dev] Broker data layouts

2018-08-28 Thread Dominik Charousset
>> Okay. In the future, we probably need some form of
>> "serialization-free" batching mechanism to ship data more efficiently.
> 
> Do you guys have a sense of how load splits up between serialization
> and batching/communication? My hope has been that batching itself can
> take care of the performance issues, so that we'll be able to send
> logs as standard CAF messages, each one representing a batch of N log
> lines. The benchmark I had created a little while ago to examine that
> wasn't able to get the necessary performance out of Broker/CAF to do
> that (hence the fall-back to Bro's old serialization of log messages
> for now, sent over CAF). But iirc, the conclusion was that there's
> still room for improvement in CAF that should make this feasible
> eventually. However, if you guys believe it's really CAF's
> serialization that's the bottle-neck, then we'll need to come up with
> something else indeed.

I think there are a couple of orthogonal aspects merged together here. Namely, 
(1) memory-mapping, (2) batching, and (3) performance of CAF's serialization.

1) Matthias threw in memory-mapping, but I’m not so sure if this is actually 
feasible for you. The main benefit here is to have a unified representation in 
memory, on disk, and on the wire. I think you’re still going to keep the ASCII 
log output format for Bro logs. Also, a memory-mapped format would mean dropping
the current broker::data API entirely. My hunch is that you would rather not 
break the API immediately after releasing it to the public.

2) CAF already does batching. Ideally, Broker should not need to do any 
additional batching on top of that. In fact, doing the batching in user code 
greatly diminishes the effectiveness of CAF’s own batching, because now CAF can no
longer break up chunks on its own to make efficient use of resources.
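
To illustrate point 2, here is a toy model of why pre-batched elements are
harder to schedule. It is only a sketch under stated assumptions: it does not
use CAF, and `take_within_budget`, `per_line_queue`, and `blob_queue` are
hypothetical names. The assumption is that a transport ships whole elements up
to some byte budget and cannot split an element.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy model only, not CAF's scheduler. Ships queued elements front to back
// until the next element would exceed the byte budget. Elements are
// indivisible: either a whole element is shipped or it stays queued.
std::vector<std::string> take_within_budget(std::deque<std::string>& queue,
                                            std::size_t budget) {
  std::vector<std::string> shipped;
  std::size_t used = 0;
  while (!queue.empty() && used + queue.front().size() <= budget) {
    used += queue.front().size();
    shipped.push_back(std::move(queue.front()));
    queue.pop_front();
  }
  return shipped;
}

// One element per log line: the transport may cut a batch after any line.
std::deque<std::string> per_line_queue(std::size_t lines,
                                       std::size_t line_size) {
  return std::deque<std::string>(lines, std::string(line_size, 'x'));
}

// User-level batching: `lines_per_blob` lines are glued into one opaque
// element that the transport can no longer break up.
std::deque<std::string> blob_queue(std::size_t lines, std::size_t line_size,
                                   std::size_t lines_per_blob) {
  std::deque<std::string> result;
  for (std::size_t i = 0; i < lines; i += lines_per_blob) {
    std::size_t n = std::min(lines_per_blob, lines - i);
    result.emplace_back(n * line_size, 'x');
  }
  return result;
}
```

With eight 30-byte lines and a 100-byte budget, the per-line queue lets the
model ship three lines right away, while two 120-byte blobs of four lines each
do not fit at all and stay queued.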

3) Serialization should really not be a bottleneck. The costly part is 
shuffling bytes around in buffers and heap allocations when deserializing a 
broker::data. There’s no way around these two costs. Do you still remember what 
showed up during your investigation that triggered you to go with the blob? 
Because what I can see as a *much* bigger issue is *copying* overhead, not 
serialization. CAF streams assume that individual elements are cheap to copy. 
So probably a copy-on-write optimization for broker::data would have a much 
higher impact on performance (it’s also straightforward to implement and CAF 
has re-usable pieces for that). If serialization still shows up with 
unreasonable costs in a profiler, however, there are ways to speed things up. 
The customization point here is a specialized inspect() overload for 
broker::data that essentially allows you to apply any optimization you want (and 
that might be used in Bro’s framework).
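
A minimal sketch of the copy-on-write idea from point 3, assuming a
single-threaded setting and using only the standard library. `cow_record` is a
hypothetical stand-in for a heavyweight value such as broker::data; it is not
Broker's type and does not use CAF's building blocks.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive-to-copy value. Copying a cow_record
// only copies a pointer; the payload is duplicated lazily on the first write.
// Single-threaded sketch: the use_count() check is not a safe test under
// concurrent access.
class cow_record {
public:
  using fields = std::vector<std::string>;

  explicit cow_record(fields xs)
    : ptr_(std::make_shared<fields>(std::move(xs))) {}

  // Read access never copies the payload.
  const fields& get() const { return *ptr_; }

  // Write access detaches first if another handle shares the payload.
  fields& unshared() {
    if (ptr_.use_count() > 1)
      ptr_ = std::make_shared<fields>(*ptr_);
    return *ptr_;
  }

  // For illustration: true if both handles point at the same payload.
  bool shares_with(const cow_record& other) const {
    return ptr_ == other.ptr_;
  }

private:
  std::shared_ptr<fields> ptr_;
};
```

Passing such a handle through several stream stages would then cost one
reference-count update per copy, and only a stage that calls `unshared()` on a
shared handle pays for a deep copy.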

I hope we’re not talking past each other. :)

An in-depth performance analysis of Broker’s streaming layer has been on my todo list 
for months at this point. I hope I get something done before the Bro Workshop 
in Europe. Then we can hopefully discuss this with some reliable data in person.

Dominik