Re: [Bro-Dev] Jira filter results
On Tue, Aug 28, 2018 at 12:48 PM Johanna Amann wrote:

> "The filter configured for this gadget could not be retrieved. Please
> verify it is still valid on the issue navigator.".

Should be showing merge requests again.

- Jon
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
Re: [Bro-Dev] Broker data layouts
On Tue, Aug 28, 2018 at 17:12 +0200, Dominik Charousset wrote:

> 1) Matthias threw in memory-mapping, but I’m not so sure if this is
> actually feasible for you.

Yeah, our normal use case is different, memory-mapping won't help much with that.

> 2) CAF already does batching. Ideally, Broker should not need to do
> any additional batching on top of that.

Yep, but (3) was the problem with that:

> Do you still remember what showed up during your investigation that
> triggered you to go with the blob?

Looking back through emails, at some point Jon replaced CAF serialization with these blobs and got substantially better performance. He also had a patch that reproduced the effect with the benchmark tool you wrote. I'm pasting that in below, I'm assuming it still applies. Looks like the conclusion at that time was that it is indeed an issue with the serialization and/or copying the data.

> An in-depth performance analysis of Broker’s streaming layer is on my
> todo list for months at this point. I hope I get something done before
> the Bro Workshop in Europe.

That would be great. :)

Robin

```
diff --git a/tests/benchmark/broker-stream-benchmark.cc b/tests/benchmark/broker-stream-benchmark.cc
index 821ac39..26b0778 100644
--- a/tests/benchmark/broker-stream-benchmark.cc
+++ b/tests/benchmark/broker-stream-benchmark.cc
@@ -1,6 +1,7 @@
 #include
 #include
+#include

 using std::cout;
 using std::cerr;
@@ -55,8 +56,11 @@ void publish_mode(broker::endpoint& ep, const std::string& topic_str) {
       // nop
     },
     [=](caf::unit_t&, downstream>& out, size_t num) {
-      for (size_t i = 0; i < num; ++i)
-        out.push(std::make_pair(topic_str, "Lorem ipsum dolor sit amet."));
+      for (size_t i = 0; i < num; ++i) {
+        auto ev = broker::bro::Event(std::string("event_1"),
+                                     std::vector{42, "test"});
+        out.push(std::make_pair(topic_str, std::move(ev)));
+      }
       global_count += num;
     },
     [=](const caf::unit_t&) {
```

--
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
[Bro-Dev] Jira filter results
Hi,

when I go to tracker.bro.org, the top-right box (Filter result) for me shows: "The filter configured for this gadget could not be retrieved. Please verify it is still valid on the issue navigator.". This seems to be independent of the browser.

I think this used to show the merge requests. Can someone perhaps fix that again? :)

Thanks,
Johanna
Re: [Bro-Dev] Broker data layouts
>> Okay. In the future, we probably need some form of
>> "serialization-free" batching mechanism to ship data more efficiently.
>
> Do you guys have a sense of how load splits up between serialization
> and batching/communication? My hope has been that batching itself can
> take care of the performance issues, so that we'll be able to send
> logs as standard CAF messages, each one representing a batch of N log
> lines. The benchmark I had created a little while ago to examine that
> wasn't able to get the necessary performance out of Broker/CAF to do
> that (hence the fall-back to Bro's old serialization of log messages
> for now, sent over CAF). But iirc, the conclusion was that there's
> still room for improvement in CAF that should make this feasible
> eventually. However, if you guys believe it's really CAF's
> serialization that's the bottle-neck, then we'll need to come up with
> something else indeed.

I think there are a couple of orthogonal aspects merged together here. Namely, (1) memory-mapping, (2) batching, and (3) performance of CAF's serialization.

1) Matthias threw in memory-mapping, but I’m not so sure if this is actually feasible for you. The main benefit here is to have a unified representation in memory, on disk, and on the wire. I think you’re still going to keep the ASCII log output format for Bro logs. Also, a memory-mapped format would mean dropping the current broker::data API entirely. My hunch is that you would rather not break the API immediately after releasing it to the public.

2) CAF already does batching. Ideally, Broker should not need to do any additional batching on top of that. In fact, doing the batching in user code greatly diminishes the effectiveness of CAF’s own batching, because now CAF can no longer break up chunks on its own to make efficient use of resources.

3) Serialization should really not be a bottleneck. The costly part is shuffling bytes around in buffers and heap allocations when deserializing a broker::data. There’s no way around these two costs. Do you still remember what showed up during your investigation that triggered you to go with the blob?

Because what I can see as a *much* bigger issue is *copying* overhead, not serialization. CAF streams assume that individual elements are cheap to copy. So probably a copy-on-write optimization for broker::data would have a much higher impact on performance (it’s also straightforward to implement and CAF has re-usable pieces for that).

If serialization still shows up with unreasonable costs in a profiler, however, there are ways to speed things up. The customization point here is a specialized inspect() overload for broker::data that essentially allows you to apply any optimization you want (and that might be used in Bro’s framework).

I hope we’re not talking past each other. :) An in-depth performance analysis of Broker’s streaming layer has been on my todo list for months at this point. I hope I get something done before the Bro Workshop in Europe. Then we can hopefully discuss this with some reliable data in person.

Dominik
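
To make the copy-on-write idea from point (3) concrete, here is a minimal sketch in plain C++. It is not Broker's or CAF's actual implementation: `payload` is a hypothetical stand-in for an expensive-to-copy value such as broker::data, and `cow_payload` is an illustrative wrapper built on std::shared_ptr rather than on CAF's own building blocks. Copying a handle only bumps a reference count; the payload itself is cloned the first time a shared handle is written to.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive-to-copy value such as broker::data.
struct payload {
  std::string name;
  std::vector<std::string> args;
};

// Illustrative copy-on-write handle: copies of a handle share one payload
// until one of them asks for mutable access.
class cow_payload {
public:
  explicit cow_payload(payload p)
    : ptr_(std::make_shared<payload>(std::move(p))) {}

  // Read access never copies the payload.
  const payload& read() const { return *ptr_; }

  // Write access clones the payload first if another handle shares it.
  // Note: use_count() is only a reliable test when handles are not being
  // copied concurrently on other threads; a production version would use an
  // intrusive reference count with the appropriate memory ordering.
  payload& write() {
    if (ptr_.use_count() > 1)
      ptr_ = std::make_shared<payload>(*ptr_);
    return *ptr_;
  }

  // True if both handles currently point at the same underlying payload.
  bool shares_with(const cow_payload& other) const {
    return ptr_ == other.ptr_;
  }

private:
  std::shared_ptr<payload> ptr_;
};
```

With such a wrapper, pushing the same element to several downstream paths would cost one reference-count increment per copy instead of a deep copy, which matches the "cheap to copy" assumption mentioned above. Whether this is what actually dominates in the benchmark would still need to be confirmed with a profiler.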