Re: [Bro-Dev] Building bro 2.6 with static broker/caf libraries

2018-12-06 Thread Robin Sommer



On Wed, Dec 05, 2018 at 19:03 -0800, Craig Leres wrote:

> (I'm working on updating the FreeBSD port to 2.6 and can't install 
> things like libcaf_io.so in /usr/local/lib because they conflict with 
> libraries potentially installed by the devel/caf port.)

What's the version of the CAF port? If it's recent, Bro should be able
to link against that.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] Consistent error handling for scripting mistakes

2018-11-12 Thread Robin Sommer



On Mon, Nov 12, 2018 at 12:27 -0600, Jonathan Siwek wrote:

> I recently noticed there's a range of behaviors in how various
> scripting mistakes are treated.

There's a 4th: InterpreterException.

> 1st question: should these be made more consistent? I'd say yes.

Yes, definitely.

> that it's only the *current function body* (yes, *function*, not
> event) that exits early -- hard to reason about what sort of arbitrary
> code was depending on that function to be fully evaluated and what
> other sort of inconsistent state is caused by exiting early.

... and what happens if the function is supposed to return a value,
but now doesn't?

> I propose, for 2.7, to aim for consistent error handling for scripting
> mistakes and that the expected behavior is to unwind all the way to
> exiting the current event handler (all its function bodies).

Agree with that. Can we do that cleanly though? The problem with
InterpreterException is that it may leak memory. We'd need to do the
unwinding manually throughout the interpreter code, but that means
touching a number of places to pass the error information back.
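
The proposed semantics can be illustrated with a minimal Python sketch. This is not Bro's interpreter code; `ScriptError`, `lookup`, `dispatch`, and the two handlers are hypothetical names chosen for illustration. It shows an error deep inside a helper function unwinding the whole event handler, while other handlers for the same event still run.

```python
class ScriptError(Exception):
    """Stands in for a scripting mistake, e.g. indexing a missing table entry."""


def lookup(table, key):
    # A helper "function body" that fails deep inside a handler.
    if key not in table:
        raise ScriptError(f"no such index: {key!r}")
    return table[key]


def dispatch(handlers, *args):
    """Run every handler for one event; an error aborts only that handler."""
    errors = []
    for handler in handlers:
        try:
            handler(*args)
        except ScriptError as exc:
            # The whole handler is unwound, not just the innermost function.
            errors.append((handler.__name__, str(exc)))
    return errors


log = []


def handler_a(table):
    value = lookup(table, "missing")  # raises; the next line never runs
    log.append(("a", value))


def handler_b(table):
    log.append(("b", lookup(table, "present")))


errors = dispatch([handler_a, handler_b], {"present": 1})
```

In this model the caller of a failed function never sees a missing return value, because control does not return to it. The memory-leak concern raised above is specific to the C++ interpreter and is not addressed by the sketch.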

> One exception may be within bro_init(), if an error happens at that
> time, I'd say it's fine to completely abort -- it's unlikely or hard
> to say whether Bro would operate well if it proceeded after an error
> that early in initialization.

Yeah, that makes sense.

Robin



Re: [Bro-Dev] attributes & named types

2018-11-05 Thread Robin Sommer



On Sat, Nov 03, 2018 at 21:58 +, Vlad Grigorescu wrote:

> In my mind, if the keyword is applied to a record, I would expect any new
> fields added to that record to also be logged.

I believe the reason for not doing that is that then one couldn't add
a field that's *not* being logged (because currently we don't have
remove-an-attribute support).

I like the "&log=T|F" syntax to control this more directly, as long as
"&log" remains equivalent to "&log=T".

Generally we need to be very careful if we want to change any
current semantics here, as it will impact custom log files that people
create in their own scripts.

Robin



Re: [Bro-Dev] Config Framework Feedback

2018-11-01 Thread Robin Sommer
The observations / thoughts in this thread seem worth a ticket, I'd
say. We can refine this over time if the current semantics aren't
quite ideal yet.

Robin

On Tue, Oct 30, 2018 at 13:17 -0700, Christian Kreibich wrote:

> Hi folks,
> 
> I would agree that it takes a bit of experimentation to figure out 
> exactly when a change handler fires and how to reliably initialize or 
> update things based on an option's value.
> 
> Consider this:
> 
>     module Foo;
> 
>     export { option foo = F; }
> 
>     function foo_handler(ID: string, foo_new: bool): bool
>         {
>         print fmt("New foo: %s", foo_new);
> 
>         # Update stuff here based on foo's value
>         # ...
> 
>         return foo_new;
>         }
> 
>     event bro_init() {
>         Option::set_change_handler("Foo::foo", foo_handler);
>     }
> 
> ... foo_handler doesn't get called when you simply run the script 
> without redefing Config::config_files. When you do redef it, the handler 
> fires both when the config file sets foo to T, and when it sets it to F.
> 
> So you have to make sure that your initialization happens even when the 
> handler doesn't get called, and you cannot write your handler assuming 
> that the new value is actually different from the old one.
> 
> These arguably aren't bugs, but imo they do take getting used to.
> 
> Best,
> -C.
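
The two pitfalls described in the quoted message (no handler call for the default value, and handler calls that repeat the current value) can be modeled with a small Python sketch. The `Options` class and `apply_foo` are hypothetical stand-ins, not the Config framework's implementation; the sketch only shows one defensive pattern: initialize explicitly, and compare old and new values inside the handler.

```python
class Options:
    """Toy option registry; handlers run on every set(), changed or not."""

    def __init__(self):
        self.values = {}
        self.handlers = {}

    def register(self, name, default):
        self.values[name] = default  # note: no handler call for the default

    def set_change_handler(self, name, handler):
        self.handlers.setdefault(name, []).append(handler)

    def set(self, name, new_value):
        for handler in self.handlers.get(name, []):
            new_value = handler(name, new_value)
        self.values[name] = new_value


applied = []  # records each time dependent state is (re)built


def apply_foo(value):
    applied.append(value)


opts = Options()
opts.register("Foo::foo", False)


def foo_handler(name, new_value):
    # Guard against "changes" that repeat the current value.
    if new_value != opts.values[name]:
        apply_foo(new_value)
    return new_value


opts.set_change_handler("Foo::foo", foo_handler)
apply_foo(opts.values["Foo::foo"])  # explicit initialization at startup

opts.set("Foo::foo", False)  # same value: handler fires, nothing re-applied
opts.set("Foo::foo", True)   # real change
```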




Re: [Bro-Dev] consistency checking for attributes

2018-10-31 Thread Robin Sommer
On Mon, Oct 29, 2018 at 11:49 -0700, Vern Paxson wrote:

> I'm planning to add basic consistency checking, which will look for
> (1) attributes that are repeated (which doesn't appear to be meaningful for
> any of them) and (2) attributes that don't make sense in a given context,
> like the ones listed above.

Sounds good to me.

Robin



Re: [Bro-Dev] JIRA to GitHub ticket migration plan

2018-09-14 Thread Robin Sommer



On Fri, Sep 14, 2018 at 13:45 -0500, Jonathan Siwek wrote:

> Anything else to worry about?

Are Jenkins and Coverity already pulling from GitHub?

I don't know if there's anything we can do on the old server to make
existing clones deal with the relocation more gracefully. I don't
think there's a way to just redirect a git client, but maybe we could get
some error message into the output or something? Not sure.

Robin



Re: [Bro-Dev] JIRA to GitHub ticket migration plan

2018-09-14 Thread Robin Sommer



On Fri, Sep 14, 2018 at 11:36 -0500, Jonathan Siwek wrote:

> I did some label tweaking and reduced some prefix names: "Component"
> -> "Area" and "Difficulty" -> "Pain".

Ok, thanks, makes sense.

Robin



Re: [Bro-Dev] JIRA to GitHub ticket migration plan

2018-09-14 Thread Robin Sommer
On Thu, Sep 13, 2018 at 19:39 -0500, Jon Siwek wrote:

> A preview of what migrated issues will look like along with new labeling 
> scheme:

Looks great, nice job. The only thing I noticed is that the labels are
quite long, making the list of tickets appear somewhat crowded. Could
we skip the prefixes ("Type:", "Component:") and instead use colors to
encode them? So, say, all types would be green, all components yellow
(which they already are), etc.

> Remaining tasks:

We are leaving the switch to GitHub as the authoritative source for the
repositories for later, right? Doing it all at the same time could
avoid confusion ("everything is on GitHub now" is an easier
statement), but would also make the process more complex. Maybe the
real question here is whether we want to switch repositories before or
after 2.6?

Robin



Re: [Bro-Dev] Broker data layouts

2018-08-28 Thread Robin Sommer


On Tue, Aug 28, 2018 at 17:12 +0200, Dominik Charousset wrote:

> 1) Matthias threw in memory-mapping, but I’m not so sure if this is
> actually feasible for you.

Yeah, our normal use case is different, memory-mapping won't help much
with that.

> 2) CAF already does batching. Ideally, Broker should not need to do
> any additional batching on top of that.

Yep, but (3) was the problem with that:

> Do you still remember what showed up during your investigation that
> triggered you to go with the blob?

Looking back through emails, at some point Jon replaced CAF
serialization with these blobs and got substantially better
performance. He also had a patch that reproduced the effect with the
benchmark tool you wrote. I'm pasting that in below, I'm assuming it
still applies. Looks like the conclusion at that time was that it is
indeed an issue with the serialization and/or copying the data.

> An in-depth performance analysis of Broker’s streaming layer is on my
> todo list for months at this point. I hope I get something done before
> the Bro Workshop in Europe.

That would be great. :)

Robin

```
diff --git a/tests/benchmark/broker-stream-benchmark.cc b/tests/benchmark/broker-stream-benchmark.cc
index 821ac39..26b0778 100644
--- a/tests/benchmark/broker-stream-benchmark.cc
+++ b/tests/benchmark/broker-stream-benchmark.cc
@@ -1,6 +1,7 @@
 #include 

 #include 
+#include 

 using std::cout;
 using std::cerr;
@@ -55,8 +56,11 @@ void publish_mode(broker::endpoint& ep, const std::string& topic_str) {
   // nop
 },
 [=](caf::unit_t&, downstream>& out, size_t num) {
-  for (size_t i = 0; i < num; ++i)
-out.push(std::make_pair(topic_str, "Lorem ipsum dolor sit amet."));
+  for (size_t i = 0; i < num; ++i) {
+auto ev = broker::bro::Event(std::string("event_1"),
+ std::vector{42, "test"});
+out.push(std::make_pair(topic_str, std::move(ev)));
+  }
   global_count += num;
 },
 [=](const caf::unit_t&) {
```



Re: [Bro-Dev] Broker data layouts

2018-08-27 Thread Robin Sommer



On Sat, Aug 25, 2018 at 17:42 +0200, Matthias Vallentin wrote:

> Okay. In the future, we probably need some form of
> "serialization-free" batching mechanism to ship data more efficiently.

Do you guys have a sense of how load splits up between serialization
and batching/communication? My hope has been that batching itself can
take care of the performance issues, so that we'll be able to send
logs as standard CAF messages, each one representing a batch of N log
lines. The benchmark I had created a little while ago to examine that
wasn't able to get the necessary performance out of Broker/CAF to do
that (hence the fall-back to Bro's old serialization of log messages
for now, sent over CAF). But iirc, the conclusion was that there's
still room for improvement in CAF that should make this feasible
eventually. However, if you guys believe it's really CAF's
serialization that's the bottleneck, then we'll indeed need to come up
with something else.
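
The batching idea (one message carrying N log lines) can be sketched in a few lines of Python. This says nothing about CAF's serialization cost, which is the open question above; `batches` and the sample data are hypothetical.

```python
from itertools import islice


def batches(lines, size):
    """Yield lists of up to `size` log lines; the last batch may be shorter."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(lines)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


log_lines = [f"conn-{i}" for i in range(7)]
messages = list(batches(log_lines, 3))
```

Each element of `messages` would then be sent as one message instead of one message per log line.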

Robin



Re: [Bro-Dev] Broker data layouts

2018-08-24 Thread Robin Sommer



On Fri, Aug 24, 2018 at 16:32 +0200, Matthias Vallentin wrote:

> It sounds like this is critical also for regular operation:

Agree. Right now a newly connecting peer gets a round of explicit
LogCreates, but that's probably not the best way forward for larger
topologies.

> is it currently impossible to parse Bro logs with Broker, because all
> logs come in the LogWrite message, which is a binary blob?

Correct. (This was different at first, but the switch was necessary
for performance. It's waiting for a better solution at this point.)

> In other words, can Broker currently be used if one writes a Bro
> script that publishes plain events (message type 1 in bro.hh)?

Yes to that. Non-Bros can exchange events (assuming they know the
schema), but not logs.

Robin



Re: [Bro-Dev] Broker data layouts

2018-08-23 Thread Robin Sommer



On Thu, Aug 23, 2018 at 10:01 -0500, Jonathan Siwek wrote:

> Yeah, that's one problem, but a bigger issue is you can't parse
> LogWrite because the content is a serial blob whose format is another
> thing not intended for public consumption.

I guess my earlier comment might have been misleading: there's
certainly work that needs to be done to open this up. Right now, it's
probably not even realistic at all because we still have a workaround
in place in there that uses the old (non-Broker) serialization code
for creating that blob. That was to get around a performance issue,
and still needs to be addressed. As part of upgrading that, I think it
can make sense to think about documenting the format we end up
choosing.

Robin



Re: [Bro-Dev] Broker data layouts

2018-08-23 Thread Robin Sommer



On Thu, Aug 23, 2018 at 15:32 +0200, Dominik Charousset wrote:

> Does that mean I need to receive the LogCreate event first to
> understand successive LogWrite events?

I don't really see a way around that without substantially increasing
volume. We could send LogCreate updates regularly, so that it's easier
to synchronize with an ongoing stream.
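
A toy Python model of that idea: the producer repeats the schema message at a fixed interval, and a consumer that joins late skips writes until it has seen one. The message tuples, `producer`, and `consume` are hypothetical and do not reflect the actual LogCreate/LogWrite wire format.

```python
def producer(rows, fields, resend_every):
    """Yield ("create", fields) before the first row and then every
    `resend_every` rows, interleaved with ("write", row) messages."""
    for i, row in enumerate(rows):
        if i % resend_every == 0:
            yield ("create", fields)
        yield ("write", row)


def consume(messages):
    """Decode writes into dicts; writes seen before any create are skipped."""
    fields = None
    decoded, skipped = [], 0
    for kind, payload in messages:
        if kind == "create":
            fields = payload
        elif fields is None:
            skipped += 1
        else:
            decoded.append(dict(zip(fields, payload)))
    return decoded, skipped


stream = list(producer([("1.2.3.4", 80), ("5.6.7.8", 443), ("9.9.9.9", 53)],
                       ("host", "port"), resend_every=2))
# A consumer that joins late misses the first create and the first write.
late_decoded, late_skipped = consume(stream[2:])
```

The resend interval trades extra volume against how many writes a late joiner has to drop before it can synchronize.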

Robin



Re: [Bro-Dev] Broker data layouts

2018-08-22 Thread Robin Sommer



On Tue, Aug 21, 2018 at 14:05 -0500, Jonathan Siwek wrote:

> Though the Broker data corresponding to log entry content is also
> opaque at the moment (I recall that was maybe for performance or
> message volume optimization),

Yeah, but generally this is something I could see opening up. The log
structure is pretty straightforward and self-describing; it'd be
mostly a matter of clean-up and documentation to make that directly
accessible to external consumers, I think. Events, on the other hand,
are semantically tied very closely to the scripts generating them, and
also much more diverse, so that self-description doesn't really seem
feasible/useful. Republishing a relevant subset certainly sounds
better for that; or, if it's really a bulk feed that's desired, some
out-of-band mechanism to convey the schema information somehow.

Robin



Re: [Bro-Dev] Broker data layouts

2018-08-21 Thread Robin Sommer



On Tue, Aug 21, 2018 at 12:34 -0500, Jonathan Siwek wrote:

> Maybe there's a more standardized approach that could be worked
> towards, but likely we just need more experience in understanding and
> defining common use-cases for external Bro data consumption.

Dominik, wasn't the original idea for VAST to provide an event
description language that would create the link between the values
coming over the wire and their interpretation? Such a specification
could be auto-generated from Bro's knowledge about the events it
generates.

Also, this question is about events, not logs, right? Logs have a
different wire format and they actually come with meta data describing
their columns.

Robin



[Bro-Dev] [Administrativa] Mailing list archives

2018-08-15 Thread Robin Sommer
Quick reminder: Please keep in mind that mails to the Bro mailing
lists are archived publicly. We had a couple of cases recently where
internal information went to a list and ended up in the archive, from
which it's difficult to remove. Thanks,

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-14 Thread Robin Sommer



On Tue, Aug 14, 2018 at 10:51 -0500, Jonathan Siwek wrote:

> Not sure, is Broker::auto_publish() currently able to do the same thing?

Hmm .. Good point. Scope is different between the two (event vs topic)
but the effect is similar in the end.

> I can also see the opposite being intuitive: If I told
> Broker::subscribe() to raise locally, then I can just always use
> Broker::publish() and not think about the difference between using
> "event" versus "publish".  Would Broker::auto_publish() be removable
> then?

I would like to say "yes" (because I like the subscribe() approach
better than auto_publish() :-), but would that work well with our
cluster topics? If we didn't have the event-specific auto_publish(),
we would have to turn on local raise for *all* events going to, e.g.,
bro/cluster/worker. And thinking about it, maybe that's in fact also
an argument against my original thinking how this could help unify
scripts --- well, unless we'd go with Jan's thought of subtopics
(e.g., subscribe("bro/cluster/worker/intel", local_raise=T)).

Robin




Re: [Bro-Dev] Broker::publish API

2018-08-14 Thread Robin Sommer



On Mon, Aug 13, 2018 at 13:55 -0500, Jonathan Siwek wrote:

> associating node IDs with subscription state and also message state
> (push node IDs into messages upon receipt before forwarding),

Yeah, that sounds like the right direction. Some reading might be
worthwhile here; there are quite a few papers out there on
routing in overlay networks.

> (1) Remove relay(...) functions
> (2) Reduce unique topic names (use pre-existing cluster topics where possible)
> (3) Add Broker::forward(topic_prefix) function + enable Broker forwarding

Yes, that sounds good to me, plus whatever that means for "publish()"
itself. I like what we have arrived at here.

One more question: what about raising published events locally as well
if the sending node is subscribed to the topic? I'm kind of torn on
that. I don't think we want that as a default, but perhaps as an
option, either with the publish() call or, likely better, with the
subscribe() call? I can see that being helpful in cases like unifying
standalone vs cluster operation; and more generally, for running
multiple node types inside the same Bro instance.

> An alternative to (3) would be implementing "real" routing in Broker
> right from the start.

In an ideal world, yes, that would certainly be nice to have. But it's
a larger task that I don't think we would be able to finish for 2.6
anymore. So, I'd put that on the list for later.

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Robin Sommer



On Fri, Aug 10, 2018 at 10:24 -0500, Jonathan Siwek wrote:

> Or is it a matter of "if a user needed it for something, then it's
> available" ?

Yeah, including matching expectations: if there's a
"bro/cluster/worker" topic, I'd expect I can publish there to reach
all the workers (from anywhere). However, I think I'm with you now
that maybe we just shouldn't do/support any forwarding in the
cluster right now. Pools and manual relaying are a (currently better)
alternative, and we can change things later. And at least it's a clear
message: no forwarding across cluster nodes.

> However, I can see Broker::forward() could make it a bit easier for a
> user wanting to manually set up a forwarding route between clusters or
> other external applications.  Is that a clear use-case we need to
> cater to now?

Well, if it were easy to add the forward() function, that could indeed
be quite useful for external integrations still. With that, one could
selectively forward custom topics (at one's own risk), without causing
a mess for the cluster. I'm thinking osquery integration for example,
where messages might go through an intermediary Bro. One advantage
that Broker-internal forwarding has compared to manual relaying is
that messages won't be propagated back to the sender.

But it's a matter of effort at this point I'd say.

> RR via proxy is not just load-balancing either, but fault-tolerance as
> well.

Yeah, that's right.

> But here you're talking more about removing the relay() functions and
> doing the RR-via-proxy "manually", right?  That seems ok to me -- once
> "real" routing is available, you then have the option to simplify your
> script and get a minor optimization by not having to manually
> handle+forward the event on proxies.

Ok, let's make that change then, I think removing relay() will help
for sure making the API easier.

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-10 Thread Robin Sommer



On Fri, Aug 10, 2018 at 15:22 +0200, Jan Grashöfer wrote:

> different purposes. If that is still a design goal, it feels like the
> structure of a cluster could be more volatile than it used to be.

It is, and we have some of that, and I think it fits in with the
discussion here too. In my mind, I see two separate things in this
discussion: one is a general Broker API that facilitates some very
different applications; and the 2nd is our cluster framework that uses
that API for a specific use-case. The latter is much easier to tune
for us in terms of how it uses Broker, as we can hide much of it
internally and adjust later, i.e., by adding a new node type. The
question for the cluster framework, then, is what API *it* provides
for scripts to share state in a cluster. And a part of the answer to
that could be "standardized topics" that are guaranteed to get the
information to where it needs to go.

> Maybe a silly question: Would that work using further "specialized" topics
> like bro/cluster/worker/intel? From my understanding one feature of topics
> is that one would be able to subscribe only to the things that one is
> interested in. Having a bunch of events just published to bro/cluster/worker
> seems counterproductive.

I hear you, but I think I haven't quite understood the concern yet.
Can you give me an example where the difference matters? What's
different between publishing intel events to bro/cluster/worker/intel
vs bro/cluster/worker if both go to all workers? Or is it so that some
workers can decide not to receive the intel events?

(And technically, subscriptions are prefix-based, so anybody
subscribing to bro/cluster/worker automatically gets
bro/cluster/worker/intel as well; not sure if that helps or hurts
here?)
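
A short Python sketch of prefix-based matching, assuming plain string-prefix comparison (an assumption of this sketch, not a statement about Broker's exact matching rules). `matches`, `receivers`, and the node names are hypothetical.

```python
def matches(subscription, topic):
    """True if `topic` falls under `subscription`, by plain string prefix."""
    return topic.startswith(subscription)


def receivers(subscriptions, topic):
    """Names of nodes with at least one subscription matching `topic`."""
    return sorted(node for node, prefixes in subscriptions.items()
                  if any(matches(p, topic) for p in prefixes))


subscriptions = {
    "worker-1": ["bro/cluster/worker"],        # broad: gets intel too
    "worker-2": ["bro/cluster/worker/intel"],  # narrow: intel only
}
```

Under this model a subtopic only lets a node receive less than the parent topic; a node subscribed to the parent cannot opt out of the subtopic.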

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-09 Thread Robin Sommer
Yeah, and let me add one thing: What if as a starting point for
modeling things, we assumed that we have global topic-based routing
available. Meaning if node A publishes to topic X, the message will
show up at all nodes that are subscribed to topic X anywhere, no
matter what the topology --- Broker will somehow take care of that. I
believe that's where we want to get eventually, through whatever
mechanism; it's not trivial, but also not rocket science.

Then we (A) design the API from that perspective and adapt our
standard scripts accordingly, and (B) see how we can get an
approximation of that assumption for today's Broker and our simple
clusters, by having the cluster framework hardcode what we need.

> (1) enable the "explicit/manual" forwarding by default?

Coming from that assumption above, I'd say yes here, doing it like you
suggest: differentiate between forwarding and locally raising an event
by topic. Maybe instead of adding it to Broker::subscribe() as a
boolean, we add a separate "Broker::forward(topic_prefix)" function,
and use that to essentially hardcode forwarding on each node just like
we want/need for the cluster. Behind the scenes Broker could still
just store the information as a boolean, but API-wise it means we can
later (once we have real routing) just rip out the forward() calls and
let Magic take its role. :)
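
The distinction between forwarding a topic and raising its events locally can be modeled as two separate prefix sets per node. The following Python sketch is a toy simulation under that assumption; the `Node` class and its attributes are hypothetical, and it has no loop detection, so it is only safe for cycle-free topologies.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.subscribed = set()  # prefixes whose events are raised locally
        self.forwarded = set()   # prefixes passed on to peers
        self.raised = []         # (topic, event) pairs raised on this node

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def publish(self, topic, event):
        for peer in self.peers:
            peer.receive(topic, event, sender=self)

    def receive(self, topic, event, sender):
        if any(topic.startswith(p) for p in self.subscribed):
            self.raised.append((topic, event))
        if any(topic.startswith(p) for p in self.forwarded):
            for peer in self.peers:
                if peer is not sender:  # never send back where it came from
                    peer.receive(topic, event, sender=self)


manager, w1, w2 = Node("manager"), Node("worker-1"), Node("worker-2")
w1.connect(manager)
w2.connect(manager)
w1.subscribed.add("bro/cluster/worker")
w2.subscribed.add("bro/cluster/worker")
manager.forwarded.add("bro/cluster/worker")  # forward, but do not raise

w1.publish("bro/cluster/worker", "hello")
```

Here the manager passes the event from one worker to the other without raising it itself, and the publishing worker does not get its own message back.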

As you say, we don't get load-balancing that way (today), but we still
have pools for distributing analyses (like the known-* scripts do).
And if distributing message load (like the Intel scripts do) is
necessary, I think pools can solve that as well: we could use a RR
proxy pool and funnel it through script-land there: send to one proxy
and have an event handler there that triggers a new event to publish
it back out to the workers. For proxies, that kind of additional load
should be fine (if load-balancing is even necessary at all; just going
through a single forwarding node might just as well be fine).
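
The round-robin proxy relay described above could look roughly like this in a Python sketch. `ProxyPool`, `relay`, and the node names are hypothetical; the real pools live in the cluster framework's script layer and are not modeled here.

```python
from itertools import cycle


class ProxyPool:
    """Hand each message to the next proxy in turn (round-robin)."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("need at least one proxy")
        self._next = cycle(proxies)

    def relay(self, event, workers):
        proxy = next(self._next)
        # On the proxy, a handler would re-publish the event to the workers.
        return proxy, [(worker, event) for worker in workers]


pool = ProxyPool(["proxy-1", "proxy-2"])
workers = ["worker-1", "worker-2", "worker-3"]
chosen = [pool.relay(f"intel-{i}", workers)[0] for i in range(4)]
```

With a single proxy in the pool this degenerates to the "single forwarding node" case mentioned above.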

> (2) re-implement any existing subscription cycles?

Now, here I'm starting to change my mind a bit. Maybe in the end, in
large topologies, it would be futile to insist on not having cycles
after all. The assumption above doesn't care about it, putting Broker
in charge of figuring it out. So with that, if we can set up
forwarding through (1) in a way that cycles in subscriptions don't
matter, it may be fine to just leave them in. But I guess in the end
it doesn't matter, removing them can only make things better/easier.

> Also maybe begs the question for later regarding the "real" routing
> mechanism: I suppose that would need to be smart enough to do
> automatic load-balancing in the case of there being more than one
> route to a subscriber.

Yeah, I'm becoming more and more convinced that in the end we won't
get around adding a "real" routing layer that takes care of such things
under the hood.

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-08 Thread Robin Sommer



On Wed, Aug 08, 2018 at 12:36 -0500, Jonathan Siwek wrote:

> * publish() API simplifications/compressions (pending decision on
> exactly what those should be)

Yeah, with an eye on the semantics for forwarding (now and later),
and whether to raise published events locally as well if the host is
subscribed itself.

And maybe the 2nd eye on: can we define these semantics so that we can
get rid of some of the "what node type am I?" checks? I'm not sure what
that would look like, but generally it would be nice if one could just
publish stuff liberally without worrying too much and the
subscriptions and forwarding semantics do the right thing (not always,
but often).

> * enable message forwarding by default (meaning re-implement the one
> or two subscription patterns that might create a cycle)

Haven't quite made up my mind on this one. In principle yes, but
right now a host needs to be subscribed to a topic to forward it, if I
remember that right. That may limit how we use topics, not sure. E.g.,
if a worker wanted to talk to other workers, with "real"
forwarding/routing it'd just publish to the worker topic and that
message would get routed there, but not be processed at the
intermediary hops as well. With our current forwarding, the hops would
need to subscribe to the worker topic as well, and hence the event
would get raised there, too.

> * see if any script-specific topics can instead use a pre-existing
> "cluster" topic

Yep.

> difficult due to having to hunt down things in various scripts and
> whether a more centralized config could be something to do?

Yeah, that sounds useful for the cluster case: it could be part of the
cluster framework to define all the relevant node types with their
characteristics. That would also make later changes to how topics and
connections are set up easier and more centralized.

For other use cases, it should still be possible to configure things
independently, too, though (say, for talking to external Broker
applications).

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-08 Thread Robin Sommer



On Wed, Aug 08, 2018 at 14:20 +, Justin Azoff wrote:

> There's also a bunch of places that I think were written standalone first and 
> then updated to work on a cluster in
> place resulting in some awkwardness..

Yeah, indeed, that's another source of complexity with these
scripts.

> But if this was written in a more 'cluster by default' way, it would just 
> look like:

Nice example. That's the kind of thing I hope we can do during the
next cycle: streamline the scripts to unify these kinds of logic.

> Broker::publish could possibly be optimized for standalone to raise the event 
> directly if not being run in a cluster.

Or we generally raise published events locally as well if the node is
subscribed to the destination topic. There are pros and cons for that
I think.

Robin



Re: [Bro-Dev] Broker::publish API

2018-08-08 Thread Robin Sommer
Yeah, I realize that. A direct port of the old logic was of course the
goal so far, with the drawbacks of that approach accepted &
understood. That's what's in place now; that's great and exactly as
planned. We can get 2.6 out this way, and it'll be fine.

My point is that now also seems like a good time to take stock of what
we got that way. That direct porting is finally getting us some sense
of where things aren't an ideal match between API and use cases yet.
And if there's something easy we can do about that before people start
relying on the new API, it seems that would be beneficial to do. But
we can see.

Robin

On Tue, Aug 07, 2018 at 13:39 -0500, Jonathan Siwek wrote:

> How much is due to new API usage and how much is due to things mainly
> being a direct port of old communication patterns (which I guess are
> written by various people over extended lengths of time and so there's
> inconsistencies to be expected) ?  Or due to being a mishmash of both
> old and new?




Re: [Bro-Dev] Broker::publish API

2018-08-08 Thread Robin Sommer


On Tue, Aug 07, 2018 at 12:05 +0200, Jan Grashöfer wrote:

> What I can recall, it's about simplifying the API in the light of
> multi-hop routing, which is not fully functional yet.

To level up a bit, what I'm hoping for is that we can find some easy
ways to simplify the API a bit more now, with an eye towards dynamic
multi-hop coming later. I don't know if it'll work out before 2.6
still, but changing the API later is more painful.

We don't need to (and can't even) solve multi-hop topologies right now; I
think nobody really has the use cases clear in their heads yet. But if
we could simplify the API a bit more for our current use cases in a
way that may extend to multi-hop naturally later, that would probably
save us some headaches at that point.

> How does forwarding work if I add another node type?

That's actually something I realized yesterday: we don't have direct
worker-to-worker communication right now, correct? A worker cannot
just publish to "bro/cluster/workers".

> Do we assume a certain cluster structure here? If yes: Is that a valid
> assumption?

I think it's safe to assume we have the cluster structure under our
own control; it's whatever we configure it to be. That's something
that's easier to change later than the API itself. Said differently:
we can always adjust the connections and topics that we set up by
default; it's much harder to change how the publish() function works.
 
> From my understanding this would mean going back to the old 
> communication patterns. What's the point of having topics if we don't 
> use them?

Let me try to phrase it differently: If there's already a topic for a
use case, it's better to use it. That's easier and less error-prone.
So if, e.g., I want to send my script's data to all workers,
publishing to bro/cluster/worker will do the job. And that will even
automatically adapt if things get more complex later. For example, I
can see having multiple otherwise independent clusters sharing a
communication channel. In that case, we could internally change the
topic to "bro/cluster//workers", and everybody using the
predefined worker topic would still reach "their" workers without any
further changes.

> That's something I would have expected. I don't think this is 
> necessarily an indicator of bad design.

Maybe it's a *necessary* design, but that doesn't make it nice. ;-) It
makes it very hard to follow the logic; when reading through the
scripts I got lost multiple times because some "@if I-am-a-manager"
was somewhere half a page earlier, disabling the code I was currently
looking at for most nodes. We probably can't totally avoid that, but
the less the better.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-08-06 Thread Robin Sommer


On Mon, Jul 30, 2018 at 09:01 -0700, I wrote:

> Is there a summary somewhere of what events & topics the cluster nodes
> are currently exchanging?

So I went through the exercise of collecting this information: what
connections do we have between nodes, who's subscribing to what, and
who's publishing what; see the attached PDF. This is based on all the
standard scripts, with some special cases ignored (like the control
framework).

I'm not fully sure yet what to conclude from this, but a few quick
observations:

- The main topics are bro/cluster/ and
  bro/cluster/node/. For these we wouldn't have a problem
  with loops if we enabled automatic, topic-driven forwarding as
  far as I can see.

- bro/cluster/broadcast seems to be the main case with a looping
  problem, because everybody subscribes to it. It's hardly used
  though. (bro/config/change is used similarly though).

- Relaying is hardly used.

- There are a couple of script-specific topics where I'm wondering
  if these could switch to using bro/cluster/ instead
  (bro/intel/*, bro/irc/dcc_transfer_update). In other words: when
  clusterizing scripts, prefer not to introduce new topics.

- There are a lot of checks in publishing code of the type "if I am
  (not) of node type X".

- Pools are used for two different things: 1. the known-* scripts
  pick a proxy to process and log the information; whereas 2. the
  Intel scripts pick a proxy just as a relay to broadcast stuff
  out, reducing load. That 1st application is a good one, but the 2nd
  feels like it should be handled differently.

Need to mull over this more, thoughts welcome.

Overall I have to say I found it pretty hard to follow this all
because we don't have much consistency right now in how scripts
structure their communication. That's not surprising, given that we're
just starting to use all this, but it suggests that we have room for
improvement in our abstractions. :)

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Broker Communication.pdf
Description: Adobe PDF document


Re: [Bro-Dev] Broker::publish API

2018-08-06 Thread Robin Sommer



On Fri, Aug 03, 2018 at 15:57 -0500, Jonathan Siwek wrote:

> Another use is hidden within Cluster::relay_rr():

Yeah, though at least from an API perspective this is different: The
caller gives relay_rr() only one topic to send to (indicator_topic).
It's then using a different topic internally to get it over to the
proxy first, but that feels more like an implementation detail. So in
that sense I would argue that this is not a use-case for the Broker
API letting users change the topic on relay. (I'm not saying that that
capability can't be useful, I'm just still looking for actual use
cases.)

I have another question about this specific case: we use relay_rr()
only for sending Intel::insert_indicator. Intel::remove_indicator gets
published normally through auto_publish(). Why the difference?

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-08-03 Thread Robin Sommer



On Fri, Jul 27, 2018 at 10:39 -0700, I wrote:

> Broker::relay(change_topic, change_topic, Config::cluster_set_option, ID, 
> val, location);

Can somebody remind me what the use-case is for changing the topic on
relay? Grepping over our standard scripts, I see only one use of
relay(), and that's the one above.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-07-30 Thread Robin Sommer



On Mon, Jul 30, 2018 at 13:30 -0500, Jonathan Siwek wrote:

> Seems clunky and could get dicey

Agreed. :) It'd just be a heuristic to catch some obvious errors. I
don't think there's more we can do, we can't really catch loops
statically by looking at the code.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-07-30 Thread Robin Sommer



On Mon, Jul 30, 2018 at 11:15 -0500, Jonathan Siwek wrote:

> I don't see why not, but it takes planning and prudence on everyone's
> part (including users) to not break that rule.

Yeah, the question is whether we can pre-configure the cluster so that
users don't need to worry about it most of the time.

> I'd be more comfortable if one could automate answering the question:
> "if I add a subscription to a given node in the network, will I create
> a cycle?".

Hmm ... What about a test mode where we'd spin up a dummy cluster
(similar to what the tests do), have each node send a message to all
subscribed topics, and watch for TTL violations?
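
A rough sketch of what such a test mode could check, modeled outside of
Bro entirely. Everything here is an assumption for illustration: the
type names, the two functions, and the rule that a node forwards a
message to every subscribed peer except the one it came from. It is not
how Broker forwards messages today. The TTL has to be larger than the
longest legitimate path, otherwise a deep but loop-free topology would
be flagged too.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a cluster: who is connected to whom, and who
// subscribes to which topics.
using Topology = std::map<std::string, std::vector<std::string>>;    // node -> peers
using Subscriptions = std::map<std::string, std::set<std::string>>;  // node -> topics

// Forwards one probe message from `node`. Returns true if the message
// still had somewhere to go when its TTL ran out, i.e., a TTL violation.
bool ttl_exceeded(const Topology& peers, const Subscriptions& subs,
                  const std::string& node, const std::string& came_from,
                  const std::string& topic, int ttl) {
    auto it = peers.find(node);
    if (it == peers.end())
        return false;
    for (const auto& peer : it->second) {
        auto s = subs.find(peer);
        if (peer == came_from || s == subs.end() || s->second.count(topic) == 0)
            continue;  // never send back to the sender or to non-subscribers
        if (ttl == 0)
            return true;  // another hop would be needed, but none is left
        if (ttl_exceeded(peers, subs, peer, node, topic, ttl - 1))
            return true;
    }
    return false;
}

// Test mode: every node sends one probe to each topic it subscribes to;
// the result is the set of topics that triggered a TTL violation.
std::set<std::string> looping_topics(const Topology& peers,
                                     const Subscriptions& subs, int ttl) {
    std::set<std::string> bad;
    for (const auto& entry : subs)
        for (const auto& topic : entry.second)
            if (ttl_exceeded(peers, subs, entry.first, "", topic, ttl))
                bad.insert(topic);
    return bad;
}
```

With a fully connected worker/manager/proxy triangle where everybody
subscribes to the same topic, looping_topics() reports that topic; a
plain star around the manager comes back empty.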

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Broker::publish API

2018-07-30 Thread Robin Sommer
On Fri, Jul 27, 2018 at 14:47 -0500, Jonathan Siwek wrote:

> Broker does not yet have automatic multihop where subscriptions are
> globally flooded automatically.

Yep, that's what I meant: dynamic multihop where each node tracks what
its peers are subscribing to, and forwards messages independent of its
own subscriptions.

> Possibly a downside is now you need to store original hop limit in
> addition to current TTL in each message if you want to detect the "is
> 1st hop" condition for the "relay_topic" option below.

Yeah, that's right. Actually I think ideally the 1st hop wouldn't have
any special role anyways if we didn't need that "relay_topic".

> It's maybe both a concern and a reality -- Bro clusters currently
> contain cycles (e.g. worker -> manager -> proxy -> worker)

True, although it's not cycles in the connection topology that matter,
it's cycles in topic subscriptions. I need to think about this a bit
more (and I need to remind myself what our topics currently look like)
but could we set up topics so that even in a cluster, messages don't
go into a cycle?

Is there a summary somewhere of what events & topics the cluster nodes
are currently exchanging?

> > - Add a second function publish_pool() that has all the same
> >   options, but receives a pool type instead of a topic (just an
> >   enum: RR, HRW).
> 
> What's the goal of the enums instead of just publish_hrw() and publish_rr() ?

Similar to what Justin wrote, it would more directly express the
intent, with less emphasis on the mechanism; we could set a
default to whatever we recommend people normally use; and it'd be more
extensible.
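
To illustrate the difference, here is a small sketch of a single
selection function that takes the pool type as an enum. The names
(PoolType, Pool, pick_node) and the choice of HRW as the default are
made up for this example; the HRW branch is plain rendezvous hashing
(highest random weight) using std::hash, which is only a stand-in for
whatever hash the cluster framework really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical pool types, mirroring the "just an enum: RR, HRW" idea.
enum class PoolType { RR, HRW };

struct Pool {
    std::vector<std::string> nodes;  // e.g., the proxies in the pool
    std::size_t next_rr = 0;         // round-robin cursor
};

// Picks the pool member that should receive a message for `key`.
// The default pool type is an arbitrary choice for this sketch.
const std::string& pick_node(Pool& pool, const std::string& key,
                             PoolType type = PoolType::HRW) {
    assert(!pool.nodes.empty());

    if (type == PoolType::RR) {
        // Round-robin: ignore the key, just rotate through the members.
        const std::string& node = pool.nodes[pool.next_rr];
        pool.next_rr = (pool.next_rr + 1) % pool.nodes.size();
        return node;
    }

    // HRW: score every (node, key) pair; the highest score wins. The
    // same key maps to the same node as long as that node is in the pool.
    std::size_t best = 0;
    std::size_t best_score = 0;
    for (std::size_t i = 0; i < pool.nodes.size(); ++i) {
        std::size_t score = std::hash<std::string>{}(pool.nodes[i] + "/" + key);
        if (i == 0 || score > best_score) {
            best = i;
            best_score = score;
        }
    }
    return pool.nodes[best];
}
```

The caller only states the intent (which pool behavior it wants), and
adding a third strategy later means one more enum value rather than
another publish_*() function.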

> At the moment, one could write their own wrapper function around that
> if they find it too verbose and always want to use certain defaults?

They could, but my general point is that it'd be nice to have a simple
API that covers the most common use cases directly and intuitively.
Then let people change defaults if they have to and know what they are
doing.

Robin



-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


[Bro-Dev] Broker::publish API

2018-07-27 Thread Robin Sommer
The other day when merging Johanna's code to clusterize the
configuration framework, I noticed this code in there:

 # [Send id=val to everyone else]

 Broker::publish(change_topic, Config::cluster_set_option, ID, val, location);

 if ( Cluster::local_node_type() != Cluster::MANAGER )
     Broker::relay(change_topic, change_topic, Config::cluster_set_option, ID, val, location);

It took me a bit to understand that ... The goal here is that a change
in a configuration value gets propagated out to all nodes in the
cluster. The Broker::publish() sends it to a node's immediate
neighbors, but not further. That means that for workers it goes (only)
to their manager; for the manager it means, it goes to all workers. If
we're not a manager, we then separately (through Broker::relay()) ask
our neighbors (that's the manager) to forward the change to *their*
neighbors (that's the other workers), without reraising it locally.

I remember we have discussed this API before, but I wanted to bring it
up again as I keep finding it confusing. I believe the code above
could be simplified by using the newer Broker::publish_and_relay(),
which was added to combine the two operations. Still, I'm realizing
now that I don't like thinking about this in terms of separate
publishing and relaying operations.

It all won't become easier once we add multi-hop routing to the mix
(which is in the works). And on top of all that, we also have
Cluster::publish_rr, Cluster::publish_hrw, Cluster::relay_rr, and
Cluster::relay_hrw -- another set of separate publishing & relay
options.

I'm wondering if we should give it another try to simplify this API
while we still can (i.e., before 2.6 goes out). To me, the most
intuitive publish operation is "send to topic T and propagate to
everybody subscribed to that topic". I'd structure the API around
that, making it the main publish function, simply:

Broker::publish(topic, args);

That would send to all neighbors, which then process locally and relay
to their neighbors. Right now, that would propagate just across one
hop but once we have multihop that'd start being broadcasted out
broadly.

To support the other use cases, we can then add modifiers & functions
to tweak this default, e.g.:

- Give publish() another argument "relay: bool =T" to prevent
  it from going beyond the immediate receiver. Or maybe instead:
  "relay_hops: int =-1" to specify the max number of hops
  to relay across, with -1 meaning no limit. (I recall concerns
  about loops being too easy to create; we could set the default
  here to F/0 to default to no forwarding, although conceptually I
  don't really like that :-)

- Give publish() another argument "relay_topic: string =""
  to change the topic when relaying on the 1st hop.

- Give publish() another argument "process_on_relays: bool =T"
  to change whether a relaying hop also sees the event locally.

- Add a second function publish_pool() that has all the same
  options, but receives a pool type instead of a topic (just an
  enum: RR, HRW).
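
To make the intended semantics of the "relay_hops" variant concrete,
here is a toy model of it. This is only a sketch under assumptions: the
Cluster struct and its members are invented for the example, the
topology is assumed to be loop-free, and a node forwards to every
subscribed peer except the one the message came from. None of this is
existing Broker code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory cluster used to show hop-limited publishing.
struct Cluster {
    std::map<std::string, std::vector<std::string>> peers;  // direct connections
    std::map<std::string, std::set<std::string>> subs;      // topics per node
    std::vector<std::string> delivered;  // nodes that processed the event, in order

    void connect(const std::string& a, const std::string& b) {
        peers[a].push_back(b);
        peers[b].push_back(a);
    }

    // relay_hops == 0: only immediate neighbors see the event.
    // relay_hops == -1: no limit (assumes the topology has no loops).
    void publish(const std::string& from, const std::string& topic,
                 int relay_hops = -1) {
        send(from, "", topic, relay_hops);
    }

private:
    void send(const std::string& node, const std::string& came_from,
              const std::string& topic, int hops_left) {
        for (const auto& peer : peers[node]) {
            if (peer == came_from || subs[peer].count(topic) == 0)
                continue;
            delivered.push_back(peer);  // receiver processes the event locally
            if (hops_left != 0)         // then relays it further if allowed
                send(peer, node, topic, hops_left < 0 ? -1 : hops_left - 1);
        }
    }
};
```

In a star with one manager and three workers, a worker publishing with
relay_hops=0 reaches only the manager, while the default reaches the
manager and the other two workers, which is what the publish() plus
relay() pair in the configuration framework does by hand today.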

What I'm not quite sure about is if some of these modifiers are better
to leave for the receiver to specify (e.g., whether to raise events
received on a given topic locally, or just forward). I think I can see
that either way.

Robin

-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Performance Issues after the fe7e1ee commit?

2018-06-12 Thread Robin Sommer



On Tue, Jun 12, 2018 at 14:16 -0500, you wrote:

> This has lead to the fix/workaround at [1], now in master, which

Cool, that indeed solved it! It also helps significantly when data
stores *are* being used; that still takes about 2x the time, but
that's much less than before. Now I'm wondering if we can get that
back down to normal, too ...

One question about Broker's endpoint::advance_time(): that's locking
on each call when in pcap mode, i.e., once per packet. Could we limit
that to cases where something actually needs to be done? Quick idea:
use an atomic<> for current_time plus another atomic<> counter
tracking if there are any pending messages. And go into the locked case
only if the latter is non-zero?
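
Roughly what I have in mind, as a sketch only: the Clock class and its
members below are invented for illustration and are not Broker's actual
endpoint code. The per-packet call does two atomic operations and takes
the mutex only when the pending counter says there is work.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the clock logic behind advance_time().
class Clock {
public:
    // Called by other threads when they queue something time-dependent.
    void schedule(std::int64_t due) {
        std::lock_guard<std::mutex> guard(mtx_);
        due_times_.push_back(due);
        pending_.fetch_add(1, std::memory_order_release);
    }

    // Called once per packet. Returns how many queued items became due.
    std::size_t advance_time(std::int64_t now) {
        current_time_.store(now, std::memory_order_release);

        // Fast path: nothing is pending, so skip the mutex entirely.
        if (pending_.load(std::memory_order_acquire) == 0)
            return 0;

        // Slow path: something is queued, do the locked bookkeeping.
        std::lock_guard<std::mutex> guard(mtx_);
        std::size_t fired = 0;
        for (auto it = due_times_.begin(); it != due_times_.end();) {
            if (*it <= now) {
                it = due_times_.erase(it);
                pending_.fetch_sub(1, std::memory_order_release);
                ++fired;
            } else {
                ++it;
            }
        }
        return fired;
    }

    std::int64_t now() const { return current_time_.load(std::memory_order_acquire); }
    std::size_t pending() const { return pending_.load(std::memory_order_acquire); }

private:
    std::atomic<std::int64_t> current_time_{0};
    std::atomic<std::size_t> pending_{0};
    std::mutex mtx_;
    std::vector<std::int64_t> due_times_;
};
```

One caveat with this shape: an item scheduled concurrently with an
advance_time() call may only be noticed on the next call. In pcap mode
that is one packet later, which may or may not be acceptable for
consistency.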

> General changes/improvements in CAF itself may be warranted here

Yeah, sounds like it.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Performance Issues after the fe7e1ee commit?

2018-06-11 Thread Robin Sommer


On Fri, Jun 08, 2018 at 10:47 -0700, I wrote:

> Ok, I'll dig around a bit more and also double-check that I didn't
> change any settings in the meantime.

Confirmed that I'm still seeing that difference even when using the
options to turn off the stores. Tried it on two different Fedora 28
systems, with similar results.

I haven't been able to nail down what's going on though. The
AdvanceTime() method does seem to do a lot of locking in pcap mode,
independent of whether there's actually any data store operations.
However, I tried a quick hack to get that down and that didn't change
anything.

I then ran it through oprofile. Output is attached. There's indeed
some locking showing up high in there, but I can't tell if that's just
expected idling in CAF. I am a bit surprised to see a number of
std::chrono() methods showing up rather prominently that I would expect
to be cheap.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin
Using /home/robin/bro/master/tmp/oprofile_data/samples/ for samples directory.

WARNING! Some of the events were throttled. Throttling occurs when
the initial sample rate is too high, causing an excessive number of
interrupts.  Decrease the sampling frequency. Check the directory
/home/robin/bro/master/tmp/oprofile_data/samples/current/stats/throttled
for the throttled event names.


WARNING: Lost samples detected! See 
/home/robin/bro/master/tmp/oprofile_data/samples/operf.log for details.
warning: /hpsa could not be found.
warning: /kvm could not be found.
warning: /nf_conntrack could not be found.
warning: /nf_defrag_ipv4 could not be found.
warning: /nf_nat could not be found.
warning: /nf_nat_ipv4 could not be found.
warning: /tg3 could not be found.
CPU: Intel Sandy Bridge microarchitecture, speed 2000 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 10
samples  %        image name             symbol name
48605    3.3738   kallsyms               find_busiest_group
31108    2.1593   libtcmalloc.so.4.5.1   /usr/lib64/libtcmalloc.so.4.5.1
24088    1.6720   bro                    RE_Match_State::Match(unsigned char const*, int, bool, bool, bool)
22833    1.5849   kallsyms               native_write_msr
20525    1.4247   kallsyms               finish_task_switch
20314    1.4100   kallsyms               syscall_return_via_sysret
16822    1.1677   kallsyms               __softirqentry_text_start
16520    1.1467   libcaf_core.so.0.15.7  caf::detail::double_ended_queue::lock_guard::lock_guard(std::atomic_flag&)
15112    1.0490   kallsyms               update_blocked_averages
12897    0.8952   kallsyms               run_timer_softirq
12890    0.8947   kallsyms               pick_next_task_fair
12729    0.8836   libpthread-2.27.so     nanosleep
12495    0.8673   kallsyms               update_curr
12209    0.8475   kallsyms               _raw_spin_lock
12186    0.8459   libcaf_core.so.0.15.7  caf::resumable* caf::policy::work_stealing::dequeue >(caf::scheduler::worker*)
11979    0.8315   kallsyms               __schedule
11886    0.8250   kallsyms               __update_load_avg_cfs_rq.isra.34
11463    0.7957   kallsyms               idle_cpu
11239    0.7801   kallsyms               __update_load_avg_se.isra.33
11178    0.7759   kallsyms               native_sched_clock
11010    0.7642   kallsyms               update_load_avg
10854    0.7534   libcaf_core.so.0.15.7  std::atomic::node*>::load(std::memory_order) const
10770    0.7476   kallsyms               load_balance
10737    0.7453   libcaf_core.so.0.15.7  decltype (({parm#1}->data)()) caf::policy::unprofiled::d >(caf::scheduler::worker*)
10582    0.7345   bro                    DFA_State::Xtion(int, DFA_Machine*)
10554    0.7326   libcaf_core.so.0.15.7  caf::detail::double_ended_queue::take_head()
10185    0.7070   bro                    RandTest::add(void const*, int)
10133    0.7034   libcaf_core.so.0.15.7  std::__uniq_ptr_impl::node, std::default_delete::node> >::_M_ptr()
9920     0.6886   libstdc++.so.6.0.25    /usr/lib64/libstdc++.so.6.0.25
9813     0.6811   kallsyms               swapgs_restore_regs_and_return_to_usermode
9685     0.6723   kallsyms               trigger_load_balance
9431     0.6546   libcaf_core.so.0.15.7  std::tuple_element<0ul, std::tuple::node*, std::default_delete::node> > >::type& std::get<0ul, caf::detail::double_ended_queue::node*, std::default_delete::node> >(std::tuple::node*, std::default_delete::node> >&)
9404     0.6528   libcaf_core.so.0.15.7  caf::scheduler::worker::data()
9350     0.6490   kallsyms               do_syscall_64
9311     0.6463   libcaf_core.so.0.15.7  std::enable_if::node*> >, std::is_move_constructible::node*>, std::is_move_assignable::node*> >::value, void>::type std::swap::node*>(caf::detail::do

Re: [Bro-Dev] Performance Issues after the fe7e1ee commit?

2018-06-08 Thread Robin Sommer


Ok, I'll dig around a bit more and also double-check that I didn't
change any settings in the meantime.

Robin

On Fri, Jun 08, 2018 at 12:40 -0500, you wrote:

> On Fri, Jun 8, 2018 at 12:16 PM Jon Siwek  wrote:
> 
> > Only thing I'm thinking is that your test system maybe
> > does a poorer job of scheduling the right thread to run in order for
> > the FlushPendingQueries() spin-loop to actually finish.
> 
> Actually, realized you still had bad timing with data stores off, so
> it would more likely be a problem with the AdvanceTime() code path.
> There's some mutex locking going on there, but w/o data stores
> involved there shouldn't be anyone competing for them.
> 
> - Jon
> 


-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Performance Issues after the fe7e1ee commit?

2018-06-08 Thread Robin Sommer



On Thu, Jun 07, 2018 at 17:05 -0500, you wrote:

> Thanks, if you pull master changes [1] again there's likely some improvement.

Yeah, a little bit, not much though.

> # zcat 2009-M57-day11-18.trace.gz | time bro -r - tests/m57-long.bro
> Known::use_host_store=F Known::use_service_store=F
> Known::use_cert_store=F

That indeed gets it way down, though still not back to the same level
on my box:

170.49user 58.14system 1:55.94elapsed 197%CPU

(pre-master: 73.72user 7.90system 1:06.53elapsed 122%CPU)

Are there more places where Bro's waiting for Broker in pcap mode?

> With that, I get the same timings as from before pre-Broker.  At least
> a good chunk of the difference when using data stores is that, for
> every query, Bro will immediately block until getting a response back
> from the data store thread/actor.

Yeah, I remember that discussion. It's the trade-off between
performance and consistency. Where's the code that's doing this
blocking? Would it be possible to not block right away, but sync up
with Broker when events are flushed the next time? (Like we had
discussed before as a general mechanism for making async operations
consistent)

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Performance Issues after the fe7e1ee commit?

2018-06-07 Thread Robin Sommer
Hmm, I'm still seeing much larger runtimes on that M57 trace using our
test configuration, even with yesterday's change:

; Master, pre-Broker
# zcat 2009-M57-day11-18.trace.gz | time bro -r - tests/m57-long.bro
73.72user 7.90system 1:06.53elapsed 122%CPU (0avgtext+0avgdata 
199092maxresident)

; Current master
# zcat 2009-M57-day11-18.trace.gz | time bro -r - tests/m57-long.bro
2191.59user 1606.69system 12:39.92elapsed 499%CPU (0avgtext+0avgdata 
228400maxresident)

Bro must still be blocking somewhere when reading from trace files.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Broker has landed in master, please test

2018-05-24 Thread Robin Sommer


On Wed, May 23, 2018 at 20:16 +, Adam wrote:

> I think those question belong on the main list which is for using Bro
> and its language. This list is really more about development of Bro
> itself.

Just to give context here, the reason I sent the original mail about
Broker here, including the request for feedback, was to limit the
initial round of testing to folks quite familiar with Bro and its
internals. That gives us a chance to spot any obvious issues quickly
before annoying everybody else. :-) But discussing it at either place
is fine of course, whatever works best for folks. If things seem to
work, we should definitely also announce the merge more broadly.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Moving to GitHub?

2018-05-22 Thread Robin Sommer
> * Someone is likely to report the same problem again
> * There's clear directions on how to reproduce an undesired behavior
> * There been a proposed plan of action recently
> 
> And many tickets can be ruled out:
> 
> * Vague feature requests
> * Not enough details  / difficult to reproduce
> * Exceptionally old proposals / plans

Yeah, I'm on board with these. I'd probably interpret them more
conservatively than you towards closing more tickets, but that's fine.
As you have volunteered to take this one, I'd say you get to make the
call: just go through and port over what you think makes sense along
those lines. As one additional piece, let's also think about some tags
to use for classifying tickets, including one for tasks that are good
for newcomers who want to get into the code.

(In principle I also like Alan's suggestion of moving everything over
and then just close them out so that the history remains. But I'm
afraid that couldn't be automated easily and would then just be too
much work.)

> starting with a clean slate on GitHub now only means it's likely to
> eventually end up in the same situation as JIRA later.  What then?
> Move to another tracker again?

Doesn't need to be as drastic: as some people here can confirm, I
have no problem doing extensive sweeps if things get too overwhelming. :-)

But yes, point taken, my hope is that we can stay on top of things on
the new tracker and make an effort to get stuff addressed and
resolved. We'll see. :)

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] Broker has landed in master, please test

2018-05-22 Thread Robin Sommer
We merged the new Broker version into Bro master yesterday. As this is a
major change to one of Bro's core components, I wanted to send a quick
heads-up here, along with a couple of notes.

With this merge we are completely replacing usage of Bro's classic
communication system with Broker. All the standard scripts have been
ported over, and BroControl has been adapted as well. The old
communication system remains available for the time being, but is now
deprecated and scheduled to be removed in Bro 2.7 (not 2.6). Broccoli
is now turned off by default.

With such a large change, I'm sure there'll be some more kinks to iron
out still; that's where we need everybody's help. If you have an
environment where you can test drive new Bro versions, please give
this a try. We're interested in any feedback you have, both specific
issues you encounter (best to file tickets) and general experiences
with the new version, including in particular any observations about
performance (best to send to this list).

From a user's perspective, not much should even be changing; most of
the new stuff is under the hood. The exception is custom scripts
doing communication themselves; they need to be ported over to Broker.
Documentation for that is here:
https://www.bro.org/sphinx-git/frameworks/broker.html, including a
porting guide for existing scripts. Let us know if there's anything
missing there that would be helpful. The Broker library itself comes
with a new user manual as well, we'll get that online shortly.

One specific note on upgrading existing Bro clusters: the meaning of
"proxy" has changed. They still exist, but play a quite different role
now. If you're currently using more than one proxy, we recommend going
back to one; that'll most likely be fine with the standard scripts
(and if not, please let us know!)

Many thanks to Jon Siwek for the recent integration work tying up all
the loose ends and getting Broker mergeable. Also thanks to those who
have tested it already from the actor-system branch.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Moving to GitHub?

2018-05-18 Thread Robin Sommer


On Fri, May 18, 2018 at 08:27 -0500, you wrote:

> Doing a half-hearted effort to migrate tickets from JIRA undermines the goal
> of having an authoritative/central location for all code + tickets.  Can we
> instead try to deal with it once and for all?

What I was envisioning is more or less a clean slate: we'd migrate
over a few tickets, but essentially we'd start with an empty list. I
realize that sounds pretty harsh. However, I hardly ever see any
activity on older tickets in JIRA, and I generally believe that the
fewer open tickets a tracker has, the easier it is for people to
understand what's actually relevant and worth spending cycles on.
Tagging tickets may help, but in the end if everybody just filters 95%
out all the time anyways, I'm not sure what the value is.

That said, I'm open to a real porting effort if people do believe it's
helpful to get all the JIRA tickets into GitHub. What do others think?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Moving to GitHub?

2018-05-17 Thread Robin Sommer


On Thu, May 17, 2018 at 11:21 -0500, you wrote:

> * For porting over JIRA tickets to GitHub, "most recent" doesn't seem like a
> good metric to use.

Agree. :)

> they may as well just port all the older ones that are still valid
> over to GitHub.

That may be a bit too broad though. How about "still valid and either
(1) quite important or (2) something we expect will be addressed
reasonably soon"? We have many old tickets that are technically still
valid but unlikely to see any work anytime soon (otherwise they would
have been addressed already), and I'm worried that they would just add
noise without value. The old tickets won't go away, the JIRA will
remain. If something becomes relevant/active, we can always bring it
over at that time.

> I find myself in that situation quite often, actually, so
> transitioning to GitHub PRs, I wonder if we'd want a PR to be created
> against each individual repo?

Good point. Creating just one root PR that mentions the others sounds
good to me for such cases. 

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Moving to GitHub?

2018-05-16 Thread Robin Sommer


On Wed, May 16, 2018 at 15:23 +, you wrote:

> I too would miss the commit / change notifications, however, I think
> that this can be set up in GitHub in some way.

We can still get the same email notifications as today (which have a
bit more information than GitHub's standard ones), they will just come
with a little bit of a delay (within 10-15 minutes should be
reasonable). And if we want we can trigger that through webhooks,
too, for immediate notification, would just need a bit of work to get
it set up.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Writing analyzer for Siemens PLC

2018-05-03 Thread Robin Sommer


On Wed, May 02, 2018 at 22:22 +0200, you wrote:

> 1) Reassembling packets: Some S7CommPlus packets which payload is over a 
> certain amount of bytes will be split and need to be reassembled.

As a couple quick pointers, the DNP3 and DTLS analyzers face a similar
task, you might find some ideas there.

>  If I want to generate a Bro events which contains the payload as a
>  parameter, how do I do that?

If with "payload" you mean the raw bytes, you would pass that as a
string into the event. But it's hard to do much with raw data in
script-land. The common way would be instead creating one event per
type of payload and then raising the corresponding event as you parse
packets and find out what's in there.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] set and vector operators

2018-04-30 Thread Robin Sommer


On Mon, Apr 30, 2018 at 07:10 -0700, you wrote:

> Okay, I can live with this as long as '|' and '-' support add-to-set and
> remove-from-set.   But I think those have to work, given we'll enable them
> for operations on two sets.

Well, my vote then remains not adding new set operators for
add/delete, so that we don't have multiple ways to do the same thing.
Just looked at Python again, as a data point: That's what they do,
too. There are '|'/'&'/'-' for set/set operations, but no versions of
those for individual elements (they do that through methods instead;
add/delete are kind of our version of methods). Same for Ruby. I
looked around for a few more minutes for other languages, but didn't
immediately find any that even have any set operators at all (only
methods/functions for union/intersection/etc.).

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] set and vector operators

2018-04-30 Thread Robin Sommer


On Mon, Apr 30, 2018 at 07:10 -0700, you wrote:

> "vector(v op e)".  Wrapped in "vector(...)", the operation becomes the
> current semantics (apply "op e" separately to each element of v).

One additional piece of context here: That vector(...) syntax could
then be used more broadly in the sense of creating a different
semantic context for the operations inside. That kind of opens up a
whole new set of type-specific operator meanings, without affecting
current/standard ones (it's like introducing R inside parentheses :-).
It's not super-great, but at least it's explicit and we couldn't
come up with anything better if we want to have such operations as
operators. Might work for some other types as well.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Final Broker branch testing

2018-04-27 Thread Robin Sommer


On Thu, Apr 26, 2018 at 16:54 -0500, you wrote:

> (1) Users whose OS has insufficient CMake will need to compile/obtain a  
> newer one.

> (2) We go back to CMake 2.8.12 and have people compile CAF themselves. 
> (Or maybe we could conditionally require only 2.8.12 users to compile 
> CAF and others get the embedded CAF).
 
> (3) I need to try to hack our CMake system more to try to get back down 
> to 2.8.12 while still being able to embed CAF.

If there's something quick that ends up making (3) work, that'd be
ideal of course, but I don't think it's worth spending much time on,
given that there are reasonable ways to get a more recent cmake.

I wouldn't want to go back to not shipping CAF at all, but if we can
tell cmake that 2.8.12 is fine if users build CAF themselves, that
would be the 2nd best option I think. (1) is the worst case, which still
isn't too bad IMO, unless it does actually prevent us from building
binary packages for RH, that would be a problem.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] How to deal with stale branches?

2018-04-27 Thread Robin Sommer
Personally I don't really mind such branches sticking around for
reference purposes. We have plenty of stale branches anyways all over, it
would probably be more to clean up those (looking at myself there, too :)

Robin

On Fri, Apr 27, 2018 at 03:01 +, you wrote:

> Yeah, that's certainly one option, but I think it'd be hard for people to
> find.
> 
> On Thu, Apr 26, 2018 at 8:15 PM, Jon Siwek <jsi...@corelight.com> wrote:
> 
> >
> >
> > On 4/26/18 11:06 AM, Vlad Grigorescu wrote:
> >
> > I'm torn between deleting the branches, in an effort to not clog up git
> >> with unneeded branches, and leaving them around or perhaps archiving them
> >> somewhere, in order to not completely lose the work in case it's of value
> >> to someone down the road.
> >>
> >> I'm curious if anyone has thoughts on the best way to proceed.
> >>
> >
> > Maybe delete the branch from the official git repo and push it to your own
> > github fork.
> >
> > - Jon
> >




-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Final Broker branch testing

2018-04-26 Thread Robin Sommer


On Thu, Apr 26, 2018 at 14:30 -0700, you wrote:

> It might be. I am honestly not sure - I suspect that this still will 
> mean that some places might not be able to easily use Bro 
> anymore--adding external package sources does not seem to be a viable 
> option everywhere.

Is it a feasible compromise to allow cmake 2.8 if we don't need to
build CAF? So either people have cmake 3.0 or they need to build CAF
themselves and say --with-caf=...?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] set and vector operators

2018-04-26 Thread Robin Sommer


On Wed, Apr 25, 2018 at 22:19 -0700, you wrote:

> Now there's no problem, since the lexer only recognizes ""
> as a unit, with no whitespace allowed.

Good idea, sounds right. And in case it did turn out to be
problematic, we could still go the way of adding all as keywords
later.

> How does that sound?

Sounds good to me, the bitwise operations will be great to have, too.

Just one more thing still: I'm actually feeling pretty strongly
against having multiple different operators for the same operation
(set union, set addition/removal). I just see that as leading to
confusion: scripts will inconsistently use one or the other, people
will have different preferences which version to prefer; they may not
even remember the other one. And we'd end up having to explain why
there are two versions, without having much of a great explanation
("one is ugly" doesn't sound great to me :-). Is it just me feeling
that way?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] set and vector operators

2018-04-25 Thread Robin Sommer


On Wed, Apr 25, 2018 at 10:40 -0700, you wrote:

>   s1 + s2 Set union (for sets of the same type, of course)
>   s1 || s2Set union

(What's the difference between the two? Or do you mean either one or
the other?)

Like Justin, I was also thinking "|" and "&" might be more intuitive.
"||"/"&&" is really typically associated with boolean contexts, and
other languages might also coerce set operands into booleans in such a
context, so that, e.g., "s1 || s2" evaluates to true if either is
non-empty. 
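
Python is one language that behaves this way, which illustrates the
concern. A minimal sketch with built-in sets (variable names are made
up): "or"/"and" only test emptiness and hand back one operand, whereas
"|" computes the actual union.

```python
s1 = {"x"}
s2 = {"a"}
empty = set()

# "or" tests truthiness (an empty set is false) and returns one of its
# operands unchanged; it never merges the two sets.
picked = s1 or s2                    # s1 is non-empty, so this is s1 itself
either_non_empty = bool(empty or s2) # true because s2 is non-empty
both_empty = bool(empty or set())    # false because both are empty
both_non_empty = bool(empty and s2)  # false because one side is empty

# "|" is the real set union.
union = s1 | s2
```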

I see the problem with the parser but maybe adding keywords is the way
to go.

>   s += e  Add the element 'e' to set 's'
>   (same as the current "add s[e]")
>   s -= e  Remove the 'e' element from 's', if present
>   (same as the current "delete s[e]")

I'd skip these. I don't think we should add an additional set of
operators for things that Bro already supports, that seems confusing
to me (like Perl :)

>   s1 += s2Same as "s1 = s1 + s2"

(Or s1 |= s2 if we pick "|" for union.)

>   v += e  Append the element 'e' to the vector 'v'

That's probably the most requested Bro operator ever! :)

>   v += s  Append the elements of 's' to the vector 'v',
>   with the order not being defined

This one I'm unsure about. The point about the order being undefined
seems odd. If I don't care about order, wouldn't I stay with a set?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Overload Bro Events

2018-04-12 Thread Robin Sommer


On Thu, Apr 12, 2018 at 14:44 -0500, you wrote:

> > event overload%(c: connection%);
> > event overload%(c: connection, h: header%);
> > event overload%(c: connection, h: header, d: data%);
> 
> Overloading is not supported for functions in general (function/event/hook).

This has an interesting implication for BIT-1431: if overloading
worked, that could take the place of the  attribute suggested
there.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Broker port status

2018-03-08 Thread Robin Sommer


On Thu, Mar 08, 2018 at 13:28 -0600, you wrote:

> * Rename "proxy" nodes?

"compute" maybe?

> (1) , ,   Seems fine to deprecate these 
> now.

Agree.

> (2) Communication framework scripts.  It's awkward to deprecate stuff
> here since they internally will be using what is deprecated.  I think
> it makes sense to just remove it and let users manually make a copy if
> they still need it.

Likewise agree.

> (3) Old C++ comm. code, e.g. RemoteSerializer.  I think we'll leave
> this untouched for the 2.6 release?

Yep, I'm looking forward to ripping that out for 2.7. :)

> (4) BIFs associated with old comm. system.  Depends on (3) (and also
> actually (2)), though if the internal core code remains, I think it
> makes sense to add  to these.

Deprecating them sounds good.

> It makes more sense to me to assume the user wants to insert the key
> with a sane default value (e.g. zero/empty) in those cases, otherwise,
> it's awkward/race-prone to require they do it themselves.

Agree with this as well, has always felt a bit awkward to me too.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] Offline Broker usage (Re: [Bro-Commits] [git/bro] topic/actor-system: Fix Known scripts to be able to use alternate implemenation (50e1498))

2018-03-08 Thread Robin Sommer
Jon, I noticed your commit message on data store expiration:

> commit 50e1498d2b39d6af1f70dbc042ab544506a67e43
> Author: Jon Siwek <jsi...@corelight.com>
> Date:   Wed Mar 7 21:24:46 2018 -0600
> 
> Fix Known scripts to be able to use alternate implemenation
> 
> And run the external test suite using the alternate implementation
> due to data stores behaving differently when running on offline pcaps.
> E.g. expirations are based on wall time, not packet time, and timeouts
> (which *are* based on packet time) may occur when the store is still
> initializing due to a large interval of packet time passing.

That brings up an interesting question on data store semantics in
offline vs online mode. Ideally, there wouldn't be any difference
between the two operation modes, so that running on a trace gives
exactly the same results as online. That would match how Bro generally
operates. Could we make data store expiration driven by network time?
That'd need an API for Bro to drive Broker time forward. And for the
initialization, maybe Bro could wait for the initialization to finish?
Although I'm not quite sure here which initialization that refers to,
may not be feasible.
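
A rough sketch of what "expiration driven by network time" could look
like, in Python for brevity. This is not Broker's API: the class
ExpiringStore and its methods (advance_time, put, get) are made up for
illustration. The point is only that the store never reads the wall
clock; the caller feeds it timestamps (e.g., packet time).

```python
class ExpiringStore:
    """Toy key/value store whose entries expire by an externally driven clock."""

    def __init__(self):
        self._now = 0.0      # current time as told by the driver, not wall time
        self._data = {}      # key -> (value, expiry timestamp)

    def advance_time(self, t):
        """Move the clock forward (e.g., to the latest packet timestamp)."""
        if t > self._now:
            self._now = t
        self._expire()

    def put(self, key, value, ttl):
        self._data[key] = (value, self._now + ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        return entry[0] if entry is not None else default

    def _expire(self):
        expired = [k for k, (_, expiry) in self._data.items() if expiry <= self._now]
        for k in expired:
            del self._data[k]


store = ExpiringStore()
store.advance_time(1000.0)               # first packet of a trace
store.put("host", "1.2.3.4", ttl=60.0)
store.advance_time(1030.0)
still_there = store.get("host")          # 30s of packet time passed: entry is kept
store.advance_time(1061.0)
gone = store.get("host")                 # 61s of packet time passed: entry expired
```

With a clock like this, replaying a trace gives the same expirations no
matter how fast the trace is processed.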

Are there other differences with stores between online and offline
operation?

Robin


-- 
Robin Sommer * Corelight, Inc. * ro...@corelight.com * www.corelight.com


Re: [Bro-Dev] Shipping CAF with Broker?

2018-02-14 Thread Robin Sommer


On Wed, Feb 14, 2018 at 15:15 -0600, you wrote:

> > CAF headers are included in public broker headers.
> 
> Good point, I didn't remember that, it does complicate the situation.

Yeah, same here, I didn't think about that part either, it's
definitely a concern. Not immediately sure if there's a middle-ground
that would give us the best of both worlds: ease of installation for
Bro users, while remaining flexible for external Broker/CAF usages.
But agree that this needs more thought before moving ahead with
anything. Thanks for bringing that up, Dominik.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Queueing in Broker?

2018-02-14 Thread Robin Sommer


On Wed, Feb 14, 2018 at 17:39 +0100, you wrote:

> I think your use case is simple enough that we can make a few additions to 
> CAF and then implement this in Broker-land. Let me outline a solution here:

Yeah, that sounds like a good plan to me and should make the remaining
parts on the Broker-side pretty straight forward.

> This would have "at least once" semantics, so the receiving peer can
> receive messages twice for anything it already processed but didn’t
> have the chance to ACK. Just pointing it out.

Hmm ... Need to think about that. More than once could be a problem
for some use cases, we might need to add a way to recognize duplicates.
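
One common way to recognize duplicates under at-least-once delivery is
to tag each message with an ID and have the receiver remember what it
has already processed. A minimal Python sketch, assuming the sender
attaches such an ID (DedupReceiver and msg_id are made-up names, not
anything in Broker or CAF):

```python
class DedupReceiver:
    """Drops messages whose ID was already processed."""

    def __init__(self):
        self._seen = set()     # IDs processed so far (unbounded in this sketch)
        self.processed = []

    def receive(self, msg_id, payload):
        if msg_id in self._seen:
            return False       # duplicate from a re-send, ignore it
        self._seen.add(msg_id)
        self.processed.append(payload)
        return True


rx = DedupReceiver()
rx.receive(1, "a")
rx.receive(2, "b")
# The sender never saw the ACK for message 2 and re-sends it after reconnecting.
duplicate_accepted = rx.receive(2, "b")
rx.receive(3, "c")
```

A real version would have to bound the set of remembered IDs somehow,
for example with per-sender sequence numbers.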

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' update and proposal

2018-02-14 Thread Robin Sommer
Ok, agree that it's best to postpone "async". My gut isn't quite as
skeptical about its safety but I see the argument. (Although then
nobody will be allowed to complain about "when" anymore ;-)

Jon, if you want to think more about context/scoping it would be great
to keep the concurrency aspects of this in mind as well, as this could
eventually get us there, too. For more context here are a couple more
pointers to past ideas that eventually led to that CCS paper I had
pointed to earlier:

 - The scope-based concurrency model was originally described in
   Section 5.1 of this paper:
   http://www.icir.org/robin/papers/cc-multi-core.pdf.

 - I actually found the old concurrency code (from the Bro 1.x
   era). I'll send you a pointer to that. "grep ''" over
   those policy scripts yields the output at the end of this mail.

 - I now also remember that we indeed needed the internal tracking
   of the current context; just relying on event parameters wasn't
   sufficient. For concurrency the most tricky parts are events
   triggered indirectly, like through timers, as they will often
   need to follow the same scheduling constraints as the original
   one (not sure if that applies to "async").

Robin

- cut ---

superlinear/policy//bro.init:# =".
superlinear/policy//conn.bro:# determine_service() runs with =pair.
superlinear/policy//demux.bro:event _demux_conn(id: conn_id, tag: string, otag: 
string, rtag: string)  =connection(id)
superlinear/policy//dns.bro:msg: dns_msg, query: string) 
=connection(c$id)
superlinear/policy//firewall.bro:event report_violation(c: connection, r:rule) 
=connection(c$id)
superlinear/policy//ftp.bro:event add_to_first_seen_cmds(command: string) 
=custom(command)
superlinear/policy//hot.bro:event check_hot_event(c: connection, state: count) 
=connection(c$id)
superlinear/policy//icmp.bro:event update_flow(icmp: icmp_conn, id: count, 
is_orig: bool, payload: string) =hostpair(icmp$orig_h, icmp$resp_h)
superlinear/policy//interconn.bro:event _remove_from_demuxed_conn(id: conn_id) 
=connection(id)
superlinear/policy//nfs.bro:event nfs_request_getattr(n: connection, fh: 
string, attrs: nfs3_attrs) =custom(fh)
superlinear/policy//nfs.bro:event nfs_attempt_getattr(n: connection, status: 
count, fh: string) =custom(fh)
superlinear/policy//nfs.bro:event nfs_request_fsstat(n: connection, root_fh: 
string, stat: nfs3_fsstat) =custom(root_fh)
superlinear/policy//nfs.bro:event nfs_attempt_fsstat(n: connection, status: 
count, root_fh: string) =custom(root_fh)
superlinear/policy//notice-action-filters.bro:event 
notice_alarm_per_orig_tally(n: notice_info, host: addr) =originator(host)
superlinear/policy//notice.bro:# will run with  connection so that we can 
store notice_tags.
superlinear/policy//notice.bro:event NOTICE_conn(n: notice_info) 
=connection(n$conn$id)
superlinear/policy//portmapper.bro:event _do_pm_request(r: connection, proc: 
string, addl: string, log_it: bool) =originator(r$id$orig_h)
superlinear/policy//scan.bro:event do_rpts_check(c: connection, num: count) 
=hostpair(c$id$orig_h, c$id$resp_h)
superlinear/policy//scan.bro:event check_scan(c: connection, established: bool, 
reverse: bool) =originator(reverse? c$id$resp_h : c$id$orig_h)
superlinear/policy//scan.bro:event account_tried(c: connection, user: string, 
passwd: string) =originator(c$id$orig_h)
superlinear/policy//signatures.bro:event _do_signature_match_notice(state: 
signature_state, msg: string, data: string) 
=originator(state$conn$id$orig_h)
superlinear/policy//signatures.bro:event _do_count_per_resp(state: 
signature_state, msg: string, data: string) 
=responder(state$conn$id$resp_h)
superlinear/policy//signatures.bro:event _check_alarm_once(state: 
signature_state, msg: string, data: string) =custom(state$id)
superlinear/policy//trw-impl.bro:event add_to_friendly_remotes(a: addr) 
=originator(a)
superlinear/policy//trw-impl.bro:event check_TRW_scan(c: connection, state: 
string, reverse: bool) =originator(reverse? c$id$resp_h : c$id$orig_h)
superlinear/policy//weird.bro:event report_weird_conn_once(t: time, name: 
string, addl: string, c: connection, action: NoticeAction) =custom(name)
superlinear/policy//weird.bro:event net_weird(name: string) =custom(name)
superlinear/policy//worm.bro:global worm_list: table[addr] of count =0 
_expire = 2 days; #=originator;
superlinear/policy//worm.bro:=0 _expire = 2 
days _func=expi; # =originator;

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


On Tue, Feb 13, 2018 at 11:02 -0600, you wrote:

> On Tue, Feb 13, 2018 at 9:44 AM, Robin Sommer <ro...@icir.org> wrote:
> 
> > Sounds like we all like that idea. Now the question is if we want to
> > wait for that to materialize (which will take quite a bit more
> > brainstorming and then i

Re: [Bro-Dev] Queueing in Broker?

2018-02-14 Thread Robin Sommer


On Tue, Feb 13, 2018 at 12:15 -0500, Seth wrote:

> As a producer:
> As a consumer:

Producer-side it should be easy to enforce limits, but consumer-side
it seems more difficult as it would need either some kind of a
handshake or a notion what data represents a buffered activity. Do you
think consumer-side is important? We already can not prevent a peer
from sending too much data during normal operation either.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Queueing in Broker?

2018-02-14 Thread Robin Sommer
On Tue, Feb 13, 2018 at 11:41 -0600, Jon wrote:

> If the goals is to prevent loss of data, then don't we need more than
> just buffering, like message acknowledgements from the peer?

Yeah, I wouldn't see it as bullet-proof reliability, rather as a best
effort "let's not needlessly drop stuff on the floor" kind of thing. I'm
thinking less here of the cluster setting (where things can get
complex and we'd usually restart everything anyways), and more of
external agents streaming stuff into Bro, like with the osquery
plugin. If one needs to restart the receiving-side Bro, it would be
nice to not just drop any activity reported in the meantime. With that
perspective, it would really just need a bit of buffering of
messages that cannot be sent out right now. And if in the end they
still don't make it, that's not the end of the world.

> And if you still planned on message routing/auto-forwarding being more
> widely used, I think you would want to buffer the message while the
> longest subscribed *path* has a down node?

I was thinking to do the buffering at the routing/hop-level. The
message would get as far as it can at first. If a peer is down that a
node would have normally forwarded to, it'd buffer for a bit until
that comes back (but I realize this makes it even more fuzzy which
peers to wait for: in a flexible topology peers could come and go all
the time; see below).

That said, I'm now wondering if such buffering functionality should
really be located inside CAF, as that's in charge of low-level message
propagation.

> Yeah, I'm also unclear if there's anyway you can tell if the peer is
> supposed to be permanent vs. transient in come cases.

We could make that an explicit endpoint option: "for this peer, on
disconnect buffer stuff it would normally receive until it comes back
(subject to some limits)". We may need a better way to identify the
same peer though, just IP probably wouldn't work well. Maybe through
some ID/name sent during the handshake? One would need to configure
such a name for peers when turning on the buffering.
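
To sketch the idea of a per-peer option with limits (Python for brevity;
BufferingEndpoint, enable_buffering, peer_up/peer_down and the peer name
"bro-main" are all hypothetical, none of this exists in Broker): peers
are identified by a configured name rather than by IP, and messages for
a named peer that is down are kept in a bounded queue until it returns.

```python
from collections import deque


class BufferingEndpoint:
    """Buffers messages for named peers that are down, up to a limit per peer."""

    def __init__(self, max_buffered=1000):
        self._max = max_buffered
        self._connected = {}     # peer name -> callable that delivers a message
        self._buffers = {}       # peer name -> deque of undelivered messages

    def enable_buffering(self, name):
        # deque(maxlen=...) drops the oldest entry once the limit is reached.
        self._buffers.setdefault(name, deque(maxlen=self._max))

    def peer_up(self, name, deliver):
        self._connected[name] = deliver
        pending = self._buffers.get(name)
        while pending:
            deliver(pending.popleft())   # flush in original order

    def peer_down(self, name):
        self._connected.pop(name, None)

    def send(self, name, msg):
        deliver = self._connected.get(name)
        if deliver is not None:
            deliver(msg)
        elif name in self._buffers:
            self._buffers[name].append(msg)
        # else: peer is down and buffering is off, so the message is dropped


received = []
ep = BufferingEndpoint(max_buffered=2)
ep.enable_buffering("bro-main")
ep.peer_up("bro-main", received.append)
ep.send("bro-main", "m1")
ep.peer_down("bro-main")                 # receiving side restarts
for m in ("m2", "m3", "m4"):
    ep.send("bro-main", m)               # buffered; "m2" falls out at the limit of 2
ep.peer_up("bro-main", received.append)  # peer is back: buffer is flushed
```

This is best effort only: whatever exceeds the limit is lost, and
nothing is acknowledged.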

> Last observation is that I think any of these types of changes would
> be to the internal messaging pattern/protocol and so maybe reasonable
> to change/improve in subsequent releases in a way that's transparent
> to users.

Yeah, nothing to get in immediately, still needs some thinking. I'm
getting the sense though that we'll need it for some applications,
osquery being the main one on my mind. 

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Shipping CAF with Broker?

2018-02-14 Thread Robin Sommer
Sounds like everybody likes this idea. Jon, want to take a stab at it?
Seems like something we should do before merging the branch into
master so that everybody gets on the right track right away.

Let's try the static library approach: link CAF statically into
libbroker. I'm not 100% sure if that's straight-forward to do, but I
hope so ...

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] Queueing in Broker?

2018-02-13 Thread Robin Sommer
One more Broker idea: I'm thinking we should add a queuing mechanism
to Broker that buffers outgoing messages for a while when a peer goes
down. Once it comes back up, we'd pass them on. That way an endpoint
could restart for example without us losing data.

I'm not immediately sure how/where we'd integrate that. For outgoing
messages, we could add it to the transparent reconnect. However, for
incoming connections, where the local endpoint doesn't have a notion
of "that peer should be coming back", it might not be as straight
forward?

Robin



-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] Shipping CAF with Broker?

2018-02-13 Thread Robin Sommer
I was wondering the other day if we could add CAF as submodule to
Broker, and then just start compiling it along with everything else. A
long time ago we decided generally against shipping dependencies along
with Bro, but in this case it might make people's lives quite a bit
easier, as hardly any Bro user will have CAF installed already. And
even if they had (say from an earlier Bro version), it might not be
the right version. If we included it with Broker, things would just
work.

We could even go a step further and compile CAF statically into
libbroker, so that in the end from a user's perspective all they deal
with is Broker: if they link against it, they get everything they
need.

Would that make sense?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' update and proposal

2018-02-13 Thread Robin Sommer


On Thu, Feb 08, 2018 at 10:01 -0800, Johanna wrote:

> I just wanted to quickly chime in here to say that I generally like the
> idea of having these contexts.

Sounds like we all like that idea. Now the question is if we want to
wait for that to materialize (which will take quite a bit more
brainstorming and then implementation, obviously), or if we want to
get async in in the current state and then put that on the TODO list?
I can see arguments either way, curious what others think.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' update and proposal

2018-01-30 Thread Robin Sommer


On Tue, Jan 30, 2018 at 10:28 -0600, you wrote:

> Was there more benefit of using the predefined choice than saving the
> overhead of calling out to script-land to do the context calculation?

No, don't think so. It mainly came out of an analysis of existing
scripts, and those 5-tuple based subsets were the main use case
anyway. Actually I'm not even sure anymore if there might have been a
custom execute-my-own-function scope as well, I'll see if I can find
the old code somewhere.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' update and proposal

2018-01-30 Thread Robin Sommer


On Tue, Jan 30, 2018 at 10:11 -0500, you wrote:

> I like this idea a lot!

Yeah, I like it, too. Additional benefit: it actually opens the door
for parallelization again, too ...


>   Do you foresee that causing trouble if we went that direction
>   though?  It seems like it could cause trouble by causing events to
>   backup waiting for some other event to finish executing.

It could. The async operations all time out, so there's a cap to how
long things can get stalled, but still: if that happens to many async
operations simultaneously, we could end up with lots of stuff in flight.
On the other hand, I don't think this can be avoided though: either we
want dependencies or we don't. You can't have the cake and eat it
too I guess. :)

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' update and proposal

2018-01-30 Thread Robin Sommer


On Mon, Jan 29, 2018 at 13:58 -0600, you wrote:

>  And if 'function_call' starts as a synchronous function and later
>  changes, that's also kind of a problem, so you might see people
>  cautiously implementing the same type of code patterns everywhere
>  even if not required for some cases.

That's a good point more generally: if we require "async" at call
sites, an internal change to a framework can break existing code.

> event protocol_event_1(c: connection ...)  = { return c$id; } { ... }
> 
> I only skimmed the paper, though seemed like it outlined a similar way
> of generalizing contexts/scopes ?

Yeah, that's pretty much the idea there. For concurrency, we'd hash
that context value and use that to determine a target thread to
schedule execution to, just like in a cluster the process/machine is
determined.

An attribute can work if we're confident that the relevant information
can always be extracted from the event parameters. In a concurrent
prototype many years ago we instead used a hardcoded set of choices
based on the underlying connection triggering the event (5-tuple, host
pair, src IP, dst IP). So you'd write (iirc):

event protocol_event_1(c: connection ...)  = connection

That detaches the context calculation from event parameters, with the
obvious disadvantage that it can't be customized any further. May be
there's some middle ground where we'd get both.

(To clarify terminology: In that paper "scope" is the scheduling
granularity, e.g., "by connection". "context" is the current
instantiation of that scope (e.g., "1.2.3.4:1234,2.3.4.5:80" for
connection scope.)
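
The scope/context/thread relationship can be sketched as follows, in
Python for brevity. The scope names follow the hardcoded choices listed
above (connection, host pair, originator, responder); the function names
and the choice of CRC32 as the hash are assumptions for illustration,
not what the old prototype did.

```python
import zlib

# Each scope maps a connection to the context value used for scheduling.
SCOPES = {
    "connection": lambda c: (c["orig_h"], c["orig_p"], c["resp_h"], c["resp_p"], c["proto"]),
    "hostpair":   lambda c: (c["orig_h"], c["resp_h"]),
    "originator": lambda c: (c["orig_h"],),
    "responder":  lambda c: (c["resp_h"],),
}


def context_for(scope, conn):
    """Instantiate a scope for a concrete connection, yielding its context."""
    return SCOPES[scope](conn)


def target_thread(context, num_threads):
    """Hash the context to pick a thread; equal contexts map to the same thread."""
    # crc32 is used because Python's built-in hash() of strings differs between runs.
    return zlib.crc32(repr(context).encode()) % num_threads


conn_a = {"orig_h": "1.2.3.4", "orig_p": 1234, "resp_h": "2.3.4.5", "resp_p": 80, "proto": "tcp"}
conn_b = {"orig_h": "1.2.3.4", "orig_p": 5555, "resp_h": "2.3.4.5", "resp_p": 80, "proto": "tcp"}

ctx_a = context_for("hostpair", conn_a)
ctx_b = context_for("hostpair", conn_b)
same_thread = target_thread(ctx_a, 8) == target_thread(ctx_b, 8)
```

Under "hostpair" scope both connections share one context and therefore
one thread; under "connection" scope their contexts differ.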

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' update and proposal

2018-01-29 Thread Robin Sommer
Jan wrote:

> First of all, this async keyword reminds me of asynchronous programming
> in C#:

Nice, didn't know that.

> For the C# async paradigm, people say that async is like a zombie
> plague as a single asynchronous function can start "infecting" your
> code base by propagating async through the call graph.

That's exactly my point. If we require a keyword to be used for all
asynchronous behavior, it will need to be put in place across the
whole call stack whenever there's even the slightest chance of
"async" being used somewhere far down inside a framework. I hear you
all on the advantages of making asynchronous behavior explicit, I
fully agree with that. I just don't see it as practical from the
script writer's perspective.

Johanna wrote:

>  And if a user creates a function that in turn calls an asynchronous
>  function, I think we should require that function to be called using
>  async too. Either a user knows that a function uses an asynchronous
>  function, or the script interpreter will raise an error message
>  telling them that the async keyword is required here because an async
>  function is in the call stack.

The problem is that the interpreter cannot determine that statically
(because control flow isn't static), we'd have to resort to runtime
errors -- and that means that code that forgets to use "async" may run
fine for a while until it happens to go down that one branch that
does, e.g., a DNS lookup.

If we required that all the relevant functions (and function
declarations) get declared as "async", like in C#, then I believe we
could detect errors statically. But we'd end up having to put that
async declaration into a lot of places just on the chance that
asynchronous behavior could be used somewhere. Consider for example
the plugin functions in NetControl: They'd need to be "async" just so
that someone *could* do DNS lookups in there. Same for hooks: by
definition we don't know what they'll do, so they'll need to be
"async". And that in turn means that NOTICE() for example must become
"async" because it's running a hook. Now everytime we do a NOTICE, we
need to put an "async" in front. And everytime we call a function that
might generate a NOTICE, we'd write "async" there, too.
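
Python's asyncio has the same property, so it can serve as an analogy
for how the declaration spreads. In this sketch the function names only
mirror the Bro ones (lookup_hostname, NOTICE, a policy hook, check_scan)
and the lookup is a stand-in returning a made-up address; it is not how
Bro would implement any of this.

```python
import asyncio


async def lookup_hostname(name):
    """Stand-in for an asynchronous DNS lookup (returns a fixed, made-up answer)."""
    await asyncio.sleep(0)
    return {"192.0.2.1"}


async def policy_hook(msg):
    # A hook *might* do a lookup, so it has to be declared async ...
    return await lookup_hostname(msg)


async def notice(msg):
    # ... which forces NOTICE, which runs the hook, to be async as well ...
    return await policy_hook(msg)


async def check_scan(host):
    # ... and every caller of NOTICE, and so on up the call stack.
    return await notice(host)


addrs = asyncio.run(check_scan("www.example.org"))
```

Only the innermost function actually waits on anything, yet all four
carry the marker.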

The point of dependencies/order becoming harder to understand is valid
of course. We already have that challenge with "when" and maybe we
need to find different solutions there to express sequentiality
requirements somehow.

Justin wrote:

> event protocol_event_1(...) =1
> event protocol_event_1(...)

> Currently the 2nd event handler is guaranteed to be ran only after the
> first finishes running, right?

Correct, and that's actually something we could ensure even with
"async": we could treat the whole set of all handlers as one block
that gets suspended as a whole if an asynchronous function runs. But
as you point out, that wouldn't solve inter-event dependencies. Per
Jan's mail, one can work around that with custom code, yet it would be
much nicer if we had built-in support for that. Actually, I think one
possible solution has been floating around for a while already: event
*scopes* that express serialization requirements in terms of shared
context. Most common example: serialize all events that are triggered
by the same connection. Originally this came up in the context of
running event handlers concurrently. I believe it would solve this
problem here too: when a function suspends, halt all handlers that
depend on the same context (e.g., same connection). More on that idea
in this paper: http://www.icir.org/robin/papers/ccs14-concurrency.pdf
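
A toy illustration of "halt all handlers that depend on the same
context", in Python for brevity. ScopedScheduler and its methods are
invented for this sketch, and it ignores the case of a drained handler
suspending again: events for a suspended context are queued in order,
while events for other contexts keep running.

```python
from collections import defaultdict, deque


class ScopedScheduler:
    """Runs handlers immediately unless their context is currently suspended."""

    def __init__(self):
        self._suspended = set()
        self._pending = defaultdict(deque)   # context -> queued (handler, args)

    def dispatch(self, context, handler, *args):
        if context in self._suspended:
            self._pending[context].append((handler, args))   # keep per-context order
        else:
            handler(*args)

    def suspend(self, context):
        """A handler for this context started waiting on an async operation."""
        self._suspended.add(context)

    def resume(self, context):
        """The async operation completed; run the queued handlers in order."""
        self._suspended.discard(context)
        queue = self._pending.pop(context, deque())
        while queue:
            handler, args = queue.popleft()
            handler(*args)


log = []
sched = ScopedScheduler()
conn1, conn2 = "1.2.3.4:1234,2.3.4.5:80", "5.6.7.8:999,2.3.4.5:80"

sched.suspend(conn1)                                # conn1's handler waits on a lookup
sched.dispatch(conn1, log.append, "conn1 event 2")  # same context: held back
sched.dispatch(conn2, log.append, "conn2 event 1")  # unrelated context: runs right away
sched.resume(conn1)                                 # lookup done: held event runs now
```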

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] 'async' update and proposal

2018-01-26 Thread Robin Sommer

A while ago we discussed adding a new "async" keyword to run
certain functions asynchronously; Bro would proceed with other script
code until they complete. Summary is here:
https://www.bro.org/development/projects/broker-lang-ext.html#asynchronous-executions-without-when

After my original proof-of-concept version, I now have a 2nd-gen
implementation mostly ready that internally unifies the machinery for
"when" and "async". That simplifies the code and, more importantly,
makes the two work together more smoothly. The branch for that work is
topic/robin/sync, I'll clean that up a bit more and would then be
interested in seeing somebody test drive it.

In the meantime I want to propose a slight change to the original
plan. In earlier discussions, we ended up concluding that introducing
a new keyword to trigger the asynchronous behaviour is useful for the
script writer, as it signals that semantics are different for these
calls. Example using the syntax we arrived at:

    event bro_init()
    {
        local x = async lookup_hostname("www.icir.org"); # A
        print "IP of www.icir.org is", x;                # B
    }

Once the DNS request is issued in line A, the event handler will be
suspended until the answer arrives. That means that other event
handlers may execute before line B, i.e., execution order isn't fully
deterministic anymore. The use of "async" is pointing out that
possibility.

However, look at the following example. Let's say we want to outsource
such DNS functionality into a separate framework, like in this toy
example:

    # cat test.bro
    @load my-dns

    event bro_init()
    {
        local x = MyCoolDNSFramework::lookup("www.icir.org"); # A
        print "IP of www.icir.org is", x;                     # B
    }

    # cat my-dns.bro
    module MyCoolDNSFramework;

    export {
        global lookup: function(name: string) : set[addr];
    }

    function lookup(name: string) : set[addr] {
        local addrs = async lookup_hostname(name); # C
        return addrs;                              # D
    }

That example behaves exactly as the 1st: execution may suspend between
lines A and B because the call to MyCoolDNSFramework::lookup()
executes an asynchronous function call internally (it will hold
between C and D). But in this 2nd example that behaviour is not
apparent at the call site in line A.
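
To make the interleaving concrete, here is a minimal C++ sketch written for this note. It is not how Bro implements suspension: it uses explicit callbacks on a hypothetical single-threaded `Loop`, and `lookup_hostname`, `framework_lookup` and the returned address are stand-ins. It only shows that a handler queued behind the first one runs between lines A and B, even though the asynchronous call is hidden inside the framework function.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded event loop: tasks run one at a time, in the
// order in which they were posted.
class Loop {
public:
    using Task = std::function<void()>;

    void post(Task task) {
        assert(task && "task must be callable");
        tasks_.push(std::move(task));
    }

    void run() {
        while (!tasks_.empty()) {
            Task task = std::move(tasks_.front());
            tasks_.pop();
            task();  // may post further tasks
        }
    }

private:
    std::queue<Task> tasks_;
};

using Callback = std::function<void(const std::string&)>;

// Stand-in for an asynchronous builtin: the answer is delivered by a later
// task, never inline. The address is a made-up placeholder.
void lookup_hostname(Loop& loop, const std::string& /*name*/, Callback done) {
    loop.post([done] { done("192.0.2.1"); });
}

// Stand-in for a framework function that hides the asynchronous call.
void framework_lookup(Loop& loop, const std::string& name, Callback done) {
    lookup_hostname(loop, name, std::move(done));
}

// Records the order in which the pieces of two handlers execute.
std::vector<std::string> run_example() {
    Loop loop;
    std::vector<std::string> trace;

    // First handler: line A issues the lookup, line B runs once it completes.
    loop.post([&loop, &trace] {
        trace.push_back("A");
        framework_lookup(loop, "www.icir.org",
                         [&trace](const std::string&) { trace.push_back("B"); });
    });

    // An unrelated handler that was queued behind the first one.
    loop.post([&trace] { trace.push_back("other handler"); });

    loop.run();
    return trace;
}
```

`run_example()` returns the sequence "A", "other handler", "B": nothing at the call site of `framework_lookup` reveals that another handler may run in between.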

We could require using "async" in line A as well but that would be
extremely painful: whenever calling some function, one would need to
know whether internally the callee may end up using "async" somewhere
(potentially buried further deep inside its call stack).

I think we should instead just skip the "async" keyword altogether.
Requiring it at some places, but not others, hurts more than it helps
in my opinion. The 1st example would then just go back to look like
this:

    event bro_init()
    {
        local x = lookup_hostname("www.icir.org"); # A
        print "IP of www.icir.org is", x;          # B
    }

This would still behave the same as before: potential suspension
between A and B.

I don't think skipping "async" would be a big deal for anything,
as the cases where the new behaviour may actually lead to significant
differences should be rare.

Thoughts?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'for loop' variable modification

2018-01-09 Thread Robin Sommer


On Fri, Jan 05, 2018 at 19:28 -0600, you wrote:

> Robin has some ongoing work with adding better support for async
> function calls, and I wonder if the way that's done would make it
> pretty simple to add general coroutine support as well.

Yes, actually it would, pretty sure we could use that infrastructure
for a yield keyword.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [Bro-Commits] [git/broker] topic/actor-system: Add broker::bro::RelayEvent message type. (ce86016)

2017-12-15 Thread Robin Sommer


On Thu, Dec 14, 2017 at 15:27 -0600, you wrote:

> or proxy/data to all other proxy/data nodes (within these classes,
> nodes aren't connected with each other) was frequent enough to warrant
> putting in a single, simple function call like this.

Yeah, though then I'd actually say it's worth keeping multi-hop
topologies in mind at least. We don't need to fully solve this right
now; nobody really knows yet how topologies may end up looking in the
future. But a good rule of thumb I think is considering that generally
nodes may not be reachable directly, but only through other Broker
hops, and that Broker takes care of the routing to get messages there
as long as topic subscriptions are set up appropriately. Seems that's
actually what we have here already for some nodes. Could we use
Broker-level routing here to get connectivity between the nodes that
aren't directly connected?

> At least currently, Bro has the forwarding capability of Broker turned
> off.  IIRC, it was very easy to unintentionally create routing loops
> when it was turned on and when I asked about it, no one gave any
> justification or use-case for it, thus it got turned off.

It's off because we indeed haven't really understood yet how to use
it. :) But Broker has message TTLs already for detecting loops, and
generally I don't think there's anything preventing us from turning it
on; it should work.
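
As a generic illustration of why a TTL bounds forwarding in a looped topology, here is a small C++ sketch. It is an assumption-laden toy, not Broker's implementation: `Topology`, `flood` and `triangle` are hypothetical names, every node forwards to all peers except the one a message came from, and each hop decrements the TTL.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical topology: node name -> directly connected peers.
using Topology = std::map<std::string, std::vector<std::string>>;

// Floods one message from `origin` and returns how many deliveries happen
// before the TTL stops further forwarding.
int flood(const Topology& peers, const std::string& origin, int ttl) {
    assert(ttl >= 0);

    struct InFlight {
        std::string from;  // node the message arrived from
        std::string at;    // node currently holding the message
        int ttl;           // remaining hops
    };

    std::deque<InFlight> in_flight;
    in_flight.push_back({"", origin, ttl});

    int deliveries = 0;
    while (!in_flight.empty()) {
        InFlight msg = in_flight.front();
        in_flight.pop_front();

        if (msg.ttl == 0)
            continue;  // TTL expired: drop instead of forwarding again

        auto it = peers.find(msg.at);
        if (it == peers.end())
            continue;

        for (const auto& next : it->second) {
            if (next == msg.from)
                continue;  // do not send straight back to the previous hop
            ++deliveries;
            in_flight.push_back({msg.at, next, msg.ttl - 1});
        }
    }
    return deliveries;
}

// Three fully connected nodes, i.e. a topology that contains a loop.
Topology triangle() {
    return {{"a", {"b", "c"}}, {"b", {"a", "c"}}, {"c", {"a", "b"}}};
}
```

In the triangle, a TTL of 1 yields 2 deliveries, a TTL of 2 yields 4, and a TTL of 3 yields 6; without the TTL check the loop would never drain.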

That all said, the priority right now is on getting a basic cluster to
work, so no problem doing the relaying as you have it now if that gets
us there the quickest. Let's just keep thinking about how such
mechanisms will look in the future in other topologies, so that we
don't back ourselves into a corner (and I hear you of course: to really do
that, we need to understand better where we want to get to).

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [Bro-Commits] [git/broker] topic/actor-system: Add broker::bro::RelayEvent message type. (ce86016)

2017-12-14 Thread Robin Sommer
Hi Jon,

I'm curious what's the use case for this? Generally I think it's best
to use a combination of Broker-internal forwarding and topics to
control where messages get propagated to, as that will better
generalize to larger topologies later. The less of the routing we
control manually at the Bro level, the better I'd say. But that's not
an absolute rule of course, and I may just be missing what this is
aiming at.

Robin

On Thu, Dec 14, 2017 at 12:55 -0600, Jonathan Siwek wrote:

> Repository : ssh://g...@bro-ids.icir.org/broker
> On branch  : topic/actor-system
> 
> >---
> 
> commit ce860168961285c6d961973aa9e1ef6b7de87887
> Author: Jon Siwek <jsi...@corelight.com>
> Date:   Thu Dec 14 12:55:52 2017 -0600
> 
> Add broker::bro::RelayEvent message type.
> 
> This is meant to be a more convenient/controlled/explicit way of doing
> simple one-hop message forwarding.
> 
> 
> >---
> 
> ce860168961285c6d961973aa9e1ef6b7de87887
>  broker/bro.hh | 36 
>  1 file changed, 32 insertions(+), 4 deletions(-)
> 
> diff --git a/broker/bro.hh b/broker/bro.hh
> index 0d7f0e7..0ec93f2 100644
> --- a/broker/bro.hh
> +++ b/broker/bro.hh
> @@ -18,6 +18,7 @@ public:
>  LogWrite = 3,
>  IdentifierUpdate = 4,
>  Batch = 5,
> +RelayEvent = 6,
>};
>  
>Type type() const {
> @@ -50,7 +51,7 @@ protected:
>  class Event : public Message {
>public:
>Event(std::string name, vector args)
> -: Message(Message::Type::Event, {name, std::move(args)}) {}
> +: Message(Message::Type::Event, {std::move(name), std::move(args)}) {}
>Event(data msg) : Message(std::move(msg)) {}
>  
>const std::string& name() const {
> @@ -62,6 +63,30 @@ class Event : public Message {
>}
>  };
>  
> +/// A Bro relayed event (automatically republished after a single hop).
> +class RelayEvent : public Message {
> +  public:
> +  RelayEvent(set relay_topics, std::string name, vector args)
> +: Message(Message::Type::RelayEvent, {std::move(relay_topics),
> +   std::move(name),
> +   std::move(args)})
> + {}
> +  RelayEvent(data msg) : Message(std::move(msg)) {}
> +
> +  const set& topics() const {
> +    return get<set>(get<vector>(msg_[2])[0]);
> +  }
> +
> +  const std::string& name() const {
> +    return get<std::string>(get<vector>(msg_[2])[1]);
> +  }
> +
> +  const vector& args() const {
> +    return get<vector>(get<vector>(msg_[2])[2]);
> +  }
> +};
> +
> +
>  /// A batch of other messages.
>  class Batch : public Message {
>public:
> @@ -81,7 +106,8 @@ public:
>LogCreate(enum_value stream_id, enum_value writer_id, data writer_info,
>  data fields_data)
>  : Message(Message::Type::LogCreate,
> -  {stream_id, writer_id, writer_info, fields_data}) {
> +  {std::move(stream_id), std::move(writer_id),
> +   std::move(writer_info), std::move(fields_data)}) {
>}
>  
>LogCreate(data msg) : Message(std::move(msg)) {
> @@ -108,7 +134,8 @@ public:
>LogWrite(enum_value stream_id, enum_value writer_id, data path,
>  data vals_data)
>  : Message(Message::Type::LogWrite,
> -  {stream_id, writer_id, path, vals_data}) {
> +  {std::move(stream_id), std::move(writer_id),
> +   std::move(path), std::move(vals_data)}) {
>}
>  
>LogWrite(data msg) : Message(std::move(msg)) {
> @@ -131,7 +158,8 @@ public:
>  class IdentifierUpdate : public Message {
>  public:
>IdentifierUpdate(std::string id_name, data id_value)
> -: Message(Message::Type::IdentifierUpdate, {id_name, id_value}) {
> +: Message(Message::Type::IdentifierUpdate, {std::move(id_name),
> +     std::move(id_value)}) {
>}
>  
>IdentifierUpdate(data msg) : Message(std::move(msg)) {
> 
> 
> 
> ___
> bro-commits mailing list
> bro-comm...@bro.org
> http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-commits
> 




-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Feedback on configuration framework implementation

2017-12-01 Thread Robin Sommer


On Fri, Dec 01, 2017 at 15:22 +, you wrote:

> Now I'm glad I never got it to work right :-)

Me too. :-)

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Feedback on configuration framework implementation

2017-12-01 Thread Robin Sommer


On Thu, Nov 30, 2017 at 10:28 -0800, you wrote:

> think of that. I honestly also never liked modifying the values that are 
> passed in arguments; this is for example also theoretically possible for 
> events, but something that we have avoided to use in practice so far.

Yeah, and it also won't work for atomic values, at least not since
https://github.com/bro/bro/commit/5b889360705120c9061390214881ea376819c669
went in. :)

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)

2017-11-09 Thread Robin Sommer
Sounds good to me. We should probably label the new parts experimental
for now, as I'm sure we'll iterate some more as people get experience
with them.

Robin

On Wed, Nov 08, 2017 at 18:46 +, you wrote:

> Just a quick summary of key points of this thread related to cluster-layout, 
> messaging patterns, and API (omitting some minor stuff from Robin’s initial 
> feedback).
> 
> - "proxy" nodes will be renamed at a later point toward the end of the project
>   ("proxy" actually makes sense to me, but "data" seems to have caught on
>   so I'll go w/ that unless there's other suggestions)
> 
> - "data" nodes will connect within clusters differently than previous "proxy"
>   nodes.  Each worker connects to every data node.  Data nodes do not connect
>   with each other.
> 
> - instead of sending logs statically to Broker::log_topic, there will now be
>   a "const Broker::log_topic_func = function(id: Log::ID, path: string) 
> "
>   to better support multiple loggers and failover use-cases
> 
> - add new, explicit message routing or one-hop relaying (e.g. for the simple
>   use-case of "broadcast from this worker to all workers”)
> 
> - add a more flexible pool membership API to let scripters define their own 
> data
>   pool constraints that users can then customize (outlined in previous email)
> 
> Let me know if I missed anything.
> 
> - Jon
> 
> 


-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Scientific notation?

2017-11-06 Thread Robin Sommer


On Mon, Nov 06, 2017 at 09:16 -0500, you wrote:

> versions of awk and they all support scientific notation

I'm wondering if that's true for other log parsers as well. The main
thing I'd want to avoid is breaking people's existing scripts. We
could make it an option?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)

2017-11-01 Thread Robin Sommer


On Tue, Oct 31, 2017 at 22:35 +, you wrote:

> My thought was they can conceptually still be used for the same type
> of stuff: data sharing and offloading other misc.
> analysis/calculation.

Yeah, agree that we want such nodes, however I would like to switch
away from the proxy name. "proxy" had a very specific meaning with the
old communication system and calling the new nodes the same would be
confusing I think. 

> I’m worried I missed a previous discussion on what people expect the
> new cluster layout to look like or maybe just no one has put forth a
> coherent plan/design for that yet?

Justin, correct me if I'm wrong, but I don't think this has ever been
fully fleshed out. If anybody wants to propose something specific, we
can discuss, otherwise I would suggest we stay with the minimum for
now that replicates the old system as much as possible and then expand
on that going forward.

> Yeah, could do that, but also don't really see the problem with
> exporting things individually.  At least that way, the topic strings
> are guaranteed to be correct in the generated docs.

Yeah, that's true, I was mostly thinking from the perspective of
having a concise API in the export section. But either way seems fine.

> the broadcast.  At least I don’t think there’s another way to send
> directed messages (e.g. based on node ID) in Bro’s current API, maybe
> I missed it?

Ah, I misunderstood the purpose of these messages.

If I remember right we can send direct messages at the C++ level and
could expose that to Bro; or we could have nodes subscribe to a topic
that corresponds to their node ID. But not sure either would make it
much different, so never mind.

> I might generally be missing some context here: I remember broker
> endpoints originally being able to self-identify with the friendly
> names, so these new hello/bye events wouldn’t have been needed, but it
> didn’t seem like that functionality was around anymore.

I actually don't remember. If we had it, not sure what happened to it.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)

2017-10-31 Thread Robin Sommer
This is coming together quite nicely. Not sure if it's stable yet, but
I'll just go ahead with some feedback I noticed looking over the new
cluster API:

- One thing I can't quite tell is if this is still aiming to
  maintain compatibility with the old communication system, like
  by keeping the proxies and also the *_events patterns. Looking
  at setup-connections, it seems so. I'd say just go ahead and
  remove all legacy pieces. Maintaining two schemes in parallel is
  cumbersome, and I think it's fine to just force everything over
  to Broker.

- Is the idea for the "*_topic" constants that one just picks the
  appropriate one when sending events? Like if I want to publish
  something to all workers, I'd publish to Cluster::worker_topic?
  I think that's fine, though I'm wondering if we could compress
  the API there somehow so that Cluster doesn't need to export all
  those constants individually. One idea would be a function that
  returns a topic based on node type?

- I like the Pools! How about moving Pool with its functions out
  of the main.bro, just for clarity.

- Looks like the hello/bye events are broadcasted by all nodes. Is
  that on purpose, or should that be limited to just one, like
  just the master sending them out? Or does it not matter and this
  provides for more redundancy?

- create_store() vs "stores": Is the idea that I'd normally use
  create_store() and that populates the table, but I could also
  redef it myself instead of using create_store() to create more
  custom entries? If so, maybe make that a bit more explicit in
  the comments that there're two ways to configure that table.

Robin


On Fri, Oct 27, 2017 at 12:44 -0500, Jonathan Siwek wrote:

> Repository : ssh://g...@bro-ids.icir.org/bro
> On branch  : topic/actor-system
> Link   : 
> https://github.com/bro/bro/commit/07ad06b083d16f9cf1c86041cf7287335a74ebbb
> 
> >---
> 
> commit 07ad06b083d16f9cf1c86041cf7287335a74ebbb
> Author: Jon Siwek 
> Date:   Fri Oct 27 12:44:54 2017 -0500
> 
> First-pass broker-enabled Cluster scripting API + misc.
> 
> - Remove Broker::Options, Broker::configure().  This was only
>   half implemented (e.g. the "routable" flag wasn't used), and using
>   a function to set these options was awkward: the only way to
>   override defaults was via calling configure() in a bro_init with
>   *lower* priority such that the last call "wins".  Also doesn't
>   really make sense for it to be a function since the underlying
>   broker library can't adapt to changes in these configuration
>   values dynamically at runtime, so instead there's just now
>   two options you can redef: "Broker::forward_messages" and
>   "Broker::log_topic".
> 
> - Add Broker::node_id() to get a unique identifier for the Bro instance's
>   broker endpoint.  This is used by the Cluster API to map node name
>   (e.g. "manager") to broker endpoint so that one can track which nodes
>   are still alive.
> 
> - Fix how broker-based communication interacts with --pseudo-realtime
>   and reading pcaps: bro now terminates at end of reading pcap when
>   broker is active (this should now be equivalent to how RemoteSerializer
>   worked).
> 
> - New broker-enabled Cluster framework API
>   - Still uses Cluster::nodes as the means of setting up cluster network
>   - See Cluster::stores, Cluster::StoreInfo, and Cluster::create_store
> for how broker data stores are integrated into cluster operation
> 
> - Update several unit tests to new Cluster API.  Failing tests at
>   the moment are mostly from scripts/frameworks that haven't been
>   ported to the new Cluster API.
> 
> 
> >---
> 
> 07ad06b083d16f9cf1c86041cf7287335a74ebbb
>  aux/broker |   2 +-
>  scripts/base/frameworks/broker/main.bro|  47 ++--
>  scripts/base/frameworks/broker/store.bro   |  14 +-
>  scripts/base/frameworks/cluster/__load__.bro   |   7 -
>  scripts/base/frameworks/cluster/main.bro   | 279 
> -
>  .../base/frameworks/cluster/setup-connections.bro  | 150 +++
>  scripts/base/frameworks/control/main.bro   |   1 +
>  src/broker/Manager.cc  |  20 +-
>  src/broker/Manager.h   |  15 +-
>  src/broker/comm.bif|  21 +-
>  src/iosource/PktSrc.cc |  20 +-
>  testing/btest/Baseline/plugins.hooks/output|  18 +-
>  .../manager-1..stdout  |   4 +
>  .../manager-1.reporter.log |  11 +-
>  testing/btest/broker/remote_log_types.bro  

Re: [Bro-Dev] design summary: porting Bro scripts to use Broker

2017-10-11 Thread Robin Sommer


On Mon, Oct 09, 2017 at 19:33 +, you wrote:

> It’s just a matter of where you expect most users to feel comfortable
> making customizations: in Bro scripts or in a broctl config file.

True, though I think that applies to much of Bro's configuration, like
the logging for example. Either way, starting with script-only
customization and then reevaluating later sounds good.

> Maybe the key point is that these customizations only make sense to
> happen once before init time?

Yeah, that's right, changing store attributes afterwards seems
unlikely. From that perspective I get the redef approach. I was more
thinking about consistency with other script APIs. We use redef for
simple tuning (single-value options, timeouts, etc), but less these
days for more complex setups (see logging and input frameworks). I'd
be interested to hear what other people prefer.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] design summary: porting Bro scripts to use Broker

2017-10-06 Thread Robin Sommer
Nice!

On Fri, Oct 06, 2017 at 16:53 +, you wrote:

>   # contains topic prefixes
>   const Cluster::manager_subscriptions: set[string] 
> 
>   # contains (topic string, event name) pairs
>   const Cluster::manager_publications: set[string, string] 

I'm wondering if we can simplify this with Broker. With the old comm
system we needed the event names because that's what was subscribed
to. Now that we have topics, does the cluster framework still need to
know about the events at all? I'm thinking we could just go with a
topic convention and then the various scripts would publish there
directly.

In the most simple version of this, the cluster framework would just
hard-code a subscription to "bro/cluster/". And then scripts like the
Intel framework would just publish all their events to "bro/cluster/"
directly through Broker.

To allow for distinguishing by node type we can define separate topic
hierarchies: "bro/cluster/{manager,worker,logger}/". Each node
subscribes to the hierarchy corresponding to its type, and each script
publishes according to where it wants to send events to (again
directly using the Broker API).

I think we could fit in Justin's hashing here too: We add per node
topics as well ("bro/cluster/node/worker-1/",
"bro/cluster/node/worker-2/", etc.) and then the cluster framework can
provide a function that maps a hash key to a topic that corresponds to
a currently active node:

    local topic = Cluster::topic_for_key("abcdef"); # Returns, e.g., "bro/cluster/node/worker-1"
    Broker::publish(topic, event);

And that scheme may suggest that instead of hard-coding topics on the
sender side, the Cluster framework could generally provide a set of
functions to retrieve the right topic:

    # In SumStats framework:
    local topic = Cluster::topic_for_manager(); # Returns "bro/cluster/manager".
    Broker::publish(topic, event);

Bottom-line: If we can find a way to steer information by setting up
topics appropriately, we might not need much additional configuration
at all.
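
The topic convention above can be sketched in a few lines of C++. This is illustration only, not the Cluster framework's API: `topic_for_node_type`, `topic_for_node`, `matches` and `topic_for_key` are hypothetical helpers, subscriptions are assumed to be topic prefixes, and the hashing uses `std::hash` simply as a placeholder for whatever hash the framework would choose.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Topic hierarchy per node type, e.g. "bro/cluster/manager/".
std::string topic_for_node_type(const std::string& type) {
    return "bro/cluster/" + type + "/";
}

// Per-node topic, e.g. "bro/cluster/node/worker-1/".
std::string topic_for_node(const std::string& name) {
    return "bro/cluster/node/" + name + "/";
}

// Assumed subscription semantics: a subscription is a topic prefix.
bool matches(const std::string& subscription, const std::string& topic) {
    return topic.compare(0, subscription.size(), subscription) == 0;
}

// Maps a key onto the per-node topic of one of the currently active nodes.
std::string topic_for_key(const std::string& key,
                          const std::vector<std::string>& active_nodes) {
    assert(!active_nodes.empty());
    std::size_t index = std::hash<std::string>{}(key) % active_nodes.size();
    return topic_for_node(active_nodes[index]);
}
```

A node subscribed to "bro/cluster/" would see everything published under the per-type and per-node hierarchies, while a subscription to "bro/cluster/manager/" would not match worker topics. Note that this naive modulo mapping reshuffles keys whenever the set of active nodes changes; a real implementation might prefer a scheme that moves fewer keys.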

>   The old “communication” framework scripts can just go away as most
>   of its functions have direct corollaries in the new “broker”
>   framework.

Yep, agree.

> The one thing that is missing is the “Communication::nodes” table

Agree that it doesn't look useful from an API perspective. The Broker
framework may eventually need an equivalent table internally if we
want to offer robustness mechanisms like Justin's hashing.

> Broker Framework API
> 

I'm wondering if these store operations should become part of the
Cluster framework instead. If we added them to the Broker framework,
we'd have two separate store APIs there: one low-level version mapping
directly to the C++ Broker API, and one higher-level that configures
things like location of the DB files. That could be confusing.

>   Software::tracked_store = Broker::InitStore(Software::tracked_store_name);

I like this. One additional idea: while I see that it's generally the
user who wants to configure which backend to use, the script author
may know already if it's data that should be persistent across
execution; I'm guessing that's usually implied by the script's
semantics. We could give InitStore() an additional boolean
"persistent" to indicate that. If that's true, it'd use the
"default_backend" (or now maybe "default_db_backend"); if false, it'd
always use the MEMORY backend.
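
The selection rule described here is small enough to write down. The following C++ sketch is hypothetical and not part of Broker or Bro: `Backend`, `StoreDefaults` and `backend_for` are names made up for this note, and the set of backends is an assumption.

```cpp
#include <cassert>

// Assumed set of backends; only MEMORY is non-persistent.
enum class Backend { MEMORY, SQLITE, ROCKSDB };

// User-tunable default, standing in for a "default_db_backend" option.
struct StoreDefaults {
    Backend default_db_backend = Backend::SQLITE;
};

// The script author states whether the data must survive restarts; the
// user only chooses which persistent backend to use.
Backend backend_for(const StoreDefaults& defaults, bool persistent) {
    if (!persistent)
        return Backend::MEMORY;

    // A persistent store needs a backend that actually persists.
    assert(defaults.default_db_backend != Backend::MEMORY);
    return defaults.default_db_backend;
}
```

With the defaults above, `backend_for(defaults, false)` yields `MEMORY` and `backend_for(defaults, true)` yields `SQLITE`, whatever the script itself knows about backends.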

> # User needs to be able to choose data store backends and which cluster node 
> the
> # the master store lives on.  They can either do this manually, or BroControl
> # will autogenerate the following in cluster-layout.bro:

I don't really like the idea of autogenerating this, as it's pretty
complex information. Usually, the Broker::default_* values should be
fine, right? For the few cases where one wants to tweak that on a
per-store basis, using a manual redef on the table sounds fine to me.

Hmm, actually, what would you think about using functions instead of
tables? We could model this similar to how the logging framework does
filters: there's a default filter installed, but you can retrieve and
update it. Here there'd be a default StoreInfo, which one can update.

> redef Broker::default_master_node = "manager";
> redef Broker::default_backend = Broker::MEMORY;
> redef Broker::default_store_dir = "/home/jon/stores";

Can the default_store_dir be set to some standard location through
BroControl? Would be neat if this all just worked in the standard case
without any custom configuration at all.

> BroControl Example Usage
> 

I'll skip commenting on this and wait for your response to the above
first, as I'm wondering if we need this BroControl functionality at
all.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] New CAF release for new Broker

2017-09-29 Thread Robin Sommer
For those tracking the new Broker version in the topic/actor-system
branch: A new CAF release 0.15.4 is out now that's incorporating
everything that code requires, so no need to use the CAF "develop"
branch anymore.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Configuration framework syntax proposal

2017-09-22 Thread Robin Sommer
I was thinking the same when discussing an earlier proposal with
Johanna. The "configopt" idea came out of that with the observation
that "const " isn't quite fitting here (and, as you say, it's
already blurred anyways). At that time, however, the thinking was
still to have a 2nd namespace, and writing 'configopt X: string
="a.b.c"' seemed a bit too much. But if we just go with a more
generic display name via Broxygen, then I'm back to liking it --
except maybe for the name, how about "option" instead of "configopt"?
So we'd arrive at something like this (similar to what has been said
already):

    module Foo;

    export {

        ## The username for our new feature.
        ##
        ## Display: User Name
        option user_name: string;

    }

And we could even start deprecating "const ... " if we wanted.

Robin

On Fri, Sep 22, 2017 at 15:59 +, you wrote:

> 
> > On Sep 21, 2017, at 11:50 AM, Johanna Amann <joha...@corelight.com> wrote:
> > 
> > The feature that the
> > configuration feature wants to bring is the ability to change options
> > during runtime - which cannot be accomplished with redefs. redef-able
> > consts will still have their place afterwards (for everything that still
> > cannot be changed during runtime).
> 
> Just had a misc. thought related to the use of ‘const’:
> 
> I remember first being confused/unfamiliar with Bro’s use of ‘const’ and 
> thought it meant “never changes” only to learn it’s further qualified as 
> “never changes at run-time” so that the ‘const’ + ‘’ combo ultimately 
> means “never changes at run-time, but initial value may be changed at 
> parse-time”.
> 
> Though, technically, ‘const’ can still be modified at run-time, if you know 
> how… e.g. send_id...
> 
> And that’s maybe all ok -- it’s easy to explain unfamiliar context as I did 
> above and the means of subverting runtime modification restrictions isn’t 
> actively advertised as such.  Though, with a new config framework, there’s 
> opportunities:
> 
> 1) could remove need for the backdoor method of modifying ‘const’ values at 
> runtime, (e.g. via send_id) as that’s done through new identifiers explicitly 
> tagged for config purposes
> 
> 2) using a new ‘configopt’ access modifier may be warranted over re-using 
> ‘const’ for configuration values as the semantics are now blatantly different 
> than what ‘const’ is expected to mean.  i.e. config values are meant to 
> change at run-time, but only via a restricted API and ‘const’ is still 
> intended to never change at run-time
> 
> - Jon
> 
> 



-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Configuration framework syntax proposal

2017-09-21 Thread Robin Sommer


On Thu, Sep 21, 2017 at 14:51 +, you wrote:

> comments. Like Jan, I had a hard time understanding the benefit having
> two names for the same value: the identifier and config string.

Yeah, that's been my original concern as well. What if we focused that
new attribute just on displaying something to the user:

const user_name: string  _name="User name"

A UI would show it as "User name", but everything else (incl.
internally the configuration framework) would use
My_Program::user_name. This would even work more generically, anything
could have a _name and we'd have Broxygen pick up on it too.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 'async' keyword

2017-09-19 Thread Robin Sommer


On Tue, Sep 19, 2017 at 17:32 +, you wrote:

> Understanding the new code also requires understanding the context in
> which it is implemented and I wonder if the latter is more of a hurdle
> here.

Hey, this is bro-dev, are you saying not everybody here is intimately
familiar with the Bro source code? ;-)

So yes, I'll allow for more time to understand the context. :) The
point though is that conceptually it's both simple and difficult at
the same time. Even without all the context, one might get the basic
idea pretty quickly if looking at the right parts---or maybe one
doesn't, I don't know, that's part of why I keep looking for feedback.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] 'async' keyword

2017-09-19 Thread Robin Sommer
At BroCon a few folks asked me about the proposed "async" keyword for
Bro's scripting language. "async" is a coroutine-style language
construct that puts blocking operations on hold until they conclude,
working on other stuff first. It could replace most uses of "when" and
is arguably much nicer to use.

"async" is implemented as a proof-of-concept in the topic/robin/async
branch. Note that that Bro branch is not fully functional at the
moment, nor are performance implications clear. However, before going
any further with it I'd like to reach consensus if the current
implementation is acceptable for the Bro code base, as it's very
low-level and not easy to follow. I'd be interested in opinions here.

The commit to look at is:
https://github.com/bro/bro/commit/8653b333431648e5a33d61c3f61c6d093cff5b72

The exercise here is: Can you understand how "async" works? (If you 
can honestly answer "yes" in under 15 minutes, I buy you a beer. ;-)

Robin

PS: See the TODOs in that commit for the current state of the code.

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] send_id (Re: [Bro-Commits] [git/bro] topic/jsiwek/actor-system: Finish port of control framework to use broker. (8dddae1))

2017-08-28 Thread Robin Sommer


On Mon, Aug 28, 2017 at 19:53 +, you wrote:

> Sounds ok.  Were you going to work on adding such a message or want me to try?

I can work on it, but it'll probably take me a few days to get to it.
If you can make progress with other stuff in the meantime, I'll do that;
otherwise give it a try if you want.

> Generally for porting scripts to use Broker, I was probably going to
> end up doing the most straightforward thing I can think of to just get
> things to work and hope people follow commit messages well enough to
> call out anything questionable.

That works for me. The downside for you might be a bit of wasted time
if discussion afterwards leads to a different approach. So feel free
to also ask for feedback upfront, whatever seems best. Either way,
getting it to work by applying the most straightforward approach is
definitely the right rule of thumb.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] send_id (Re: [Bro-Commits] [git/bro] topic/jsiwek/actor-system: Finish port of control framework to use broker. (8dddae1))

2017-08-28 Thread Robin Sommer


On Mon, Aug 28, 2017 at 11:09 +0200, you wrote:

> Thanks for the clarification! I was thinking about send_id() in context 
> of the intel framework as well.

Yep, I meant Intel framework of course. :)

> So sending opaque values will still be possible using broker, right?

Yes, correct (one downside of opaque values is that only Bro itself
can send & receive them, for external Broker clients they will remain,
um, opaque :)

> As far as I understand the broker data stores (straight forward 
> key-value stores), a data store does not fit for the intel framework, as 
> it uses e.g. the patricia-trie implementation in tables to efficiently 
> match subnets.

Good point. Support for efficient subnet lookups is something we
should probably add to Broker stores at some point, but that's for the
future.
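
To make "efficient subnet lookups" concrete, here is a small Python sketch of longest-prefix matching. It is not a patricia trie and not anything Broker provides; it swaps in a simpler technique (one hash table per prefix length, tried from most to least specific) and the class name SubnetTable is hypothetical.

```python
import ipaddress


class SubnetTable:
    """Maps subnets to values and finds the most specific match for an address."""

    def __init__(self):
        # (IP version, prefix length) -> {network: value}
        self._by_len = {}

    def add(self, cidr: str, value) -> None:
        net = ipaddress.ip_network(cidr)
        self._by_len.setdefault((net.version, net.prefixlen), {})[net] = value

    def lookup(self, addr: str):
        ip = ipaddress.ip_address(addr)
        # Only consider prefix lengths stored for this IP version.
        lengths = sorted(
            (plen for (ver, plen) in self._by_len if ver == ip.version),
            reverse=True,
        )
        # Try the longest prefixes first so the most specific subnet wins.
        for plen in lengths:
            net = ipaddress.ip_network(f"{ip}/{plen}", strict=False)
            nets = self._by_len[(ip.version, plen)]
            if net in nets:
                return nets[net]
        return None
```

A plain key-value store can only answer exact-key queries, which is why a lookup of an address against stored subnets needs something like this on top.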

> 1. Sending all data at once. Maybe ok for that use case.

That would be ok for the new Intel client I think, but sending the
whole thing will put load on the sending Bro as well, which could be a
problem. It depends on the volume of the data of course, it's hard to
say where the limit is.

> 2. Sending stuff incrementally using some script-layer logic.

This might be the best way to go then. In the future I'd like to have
a script-level framework that offers some higher-level communication
patterns on top of Broker, like this one: "send large table safely".
For now, the Intel framework could implement that itself and then
maybe we can even reuse that implementation later by making it more
generic.
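
As a rough idea of what "send large table safely" could look like, here is a Python sketch that splits a table into bounded chunks and rebuilds it on the other side. Everything here is an assumption for illustration: the message layout (sequence number, last-chunk flag, entries) and the names chunk_table and TableReceiver are made up, and a real version would send each chunk as its own event with scheduling in between.

```python
from typing import Dict, Iterator, List, Tuple


def chunk_table(table: Dict, chunk_size: int) -> Iterator[Tuple[int, bool, List[Tuple]]]:
    """Yield (sequence number, is_last, entries) messages covering the table."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    items = list(table.items())
    if not items:
        yield (0, True, [])
        return
    for seq, start in enumerate(range(0, len(items), chunk_size)):
        part = items[start:start + chunk_size]
        yield (seq, start + chunk_size >= len(items), part)


class TableReceiver:
    """Rebuilds the table on the receiving side, one chunk at a time."""

    def __init__(self):
        self.table: Dict = {}
        self.complete = False

    def on_chunk(self, seq: int, is_last: bool, entries: List[Tuple]) -> None:
        self.table.update(entries)
        self.complete = is_last
```

Keeping each message small bounds how long the sender is busy per step, which addresses the load concern mentioned above. The sketch assumes chunks arrive in order and ignores loss.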

Robin


-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] send_id (Re: [Bro-Commits] [git/bro] topic/jsiwek/actor-system: Finish port of control framework to use broker. (8dddae1))

2017-08-26 Thread Robin Sommer
Jon, replacing send_id() may indeed work better with an extension at
the C++/Broker level. I'd like to avoid introducing new dependencies
on Bro's serialization code, as I'm very much hoping that once the old
communication code goes we won't need that serialization layer
anymore either (I know we're using it for opaque values over Broker
too, but that's quite contained and should be easy to replace).

Some thoughts here:

 - I'm thinking the best approach may be a new Bro-specific
   message for Broker, similar to the log-create/write messages.
   We can add that to the Bro shim in Broker. It'd send the name
   of the ID and the new value as a broker::data instance, and the
   receiving Bro updates the value as the message comes in.

 - There's one larger problem with replacing send_id() though: the
   old communication system has logic to send large values
   incrementally, so that send_id() won't block stuff. As Seth
   reminded me the SumStats framework is relying on that quite
   extensively for sending tables around. Incremental operation is
   something we don't have with Broker. I think that's ok, we can
   replace the few existing use cases of sending large values with
   something else. For SumStats that should probably be data
   stores. I don't remember if there even are further instances of
   this, my guess is no (I don't think we need to consider
   broctl's configuration updating here; those values are small
   and a non-incremental send_id() is fine).

 - Another approach for broctl's updating could be switching to
   the upcoming configuration framework, which takes a different
   approach to dynamic reconfiguration. However, it's a bit out
   still until we can make that switch completely, so for now providing
   a substitute for send_id() that can cover the simple use cases
   looks like the best way to go.
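
The first bullet above can be sketched in a few lines of Python. This is not Broker's message format or API; IdUpdate and Receiver are hypothetical names, and the value is just a Python object standing in for a broker::data instance.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class IdUpdate:
    """Hypothetical wire message: which global to set, and its new value."""
    id_name: str
    value: Any


class Receiver:
    """Holds the receiving side's globals and applies incoming updates."""

    def __init__(self, globals_table: Dict[str, Any]):
        self.globals = globals_table

    def on_message(self, msg: IdUpdate) -> bool:
        # Only update identifiers that already exist on the receiving side.
        if msg.id_name not in self.globals:
            return False
        self.globals[msg.id_name] = msg.value
        return True
```

Rejecting unknown names is a design choice made for this sketch; the post does not say how unknown identifiers should be handled.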

Robin

On Fri, Aug 25, 2017 at 13:15 -0500, Jonathan Siwek wrote:

> commit 8dddae17db9340f5261d11382aa9b67e965d5fef
> Author: Jon Siwek <jsi...@illinois.edu>
> Date:   Fri Aug 25 13:15:00 2017 -0500
> 
> Finish port of control framework to use broker.
> 
> To replace usages of the send_id BIF, I had to make 3 new BIFs
> 
> - serialize
> - deserialize
> - update_ID
> 
> Using those, any Bro value can be explicitly converted to a string
> of bytes, sent to a peer via a Broker event, unserialized to a Bro
> val on the other side and then installed into a global identifier
> via its name.
> 
> I think this may be the most straightforward way to get this to
> work for now, at least without changing significantly the Broker
> internals or messaging format.  It mostly boils down to not being
> able to deserialize into a Bro value with the 'any' type, at least
> not without also carrying Bro's actual type information somewhere
> inside Broker's serialized message.
> 
> And I think deserializing into 'any' would be needed because it's
> not possible to e.g. explicitly enumerate all possible types in a Bro
> script and have a particular event signature to use for any given one.
> That's not possible because there's infinite ways you can create
> composite types (tables, sets, vectors, etc).

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] 2.5.1 release?

2017-05-15 Thread Robin Sommer


On Sat, May 13, 2017 at 00:28 -0500, you wrote:

> We'll look at upgrading our test cluster (and UIUC's test cluster) to
> master.

Sounds good, let us know how that is going.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] can I send an opaque of bloomfilter over Cluster::manager2worker_event ?

2017-05-01 Thread Robin Sommer


On Mon, May 01, 2017 at 08:20 -0700, you wrote:

> Actually - I am not sure if we ever implemented consistent hashing over the
> cluster;

Ah, good point, we did implement that, but it needs to be configured:

## Seed for hashes computed internally for probabilistic data structures. Using
## the same value here will make the hashes compatible between independent Bro
## instances. If left unset, Bro will use a temporary local seed.
const global_hash_seed: string = ""
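
Why the seed matters can be shown with a toy Bloom filter in Python. This is a sketch, not Bro's implementation: the BloomFilter class is hypothetical and sha256 is a stand-in for whatever hash Bro uses. Two filters built with the same seed can be merged; filters with different seeds cannot, which is the kind of situation behind an "incompatible hashers" error.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter whose hash functions depend on a seed string."""

    def __init__(self, size: int, num_hashes: int, seed: str):
        self.size, self.num_hashes, self.seed = size, num_hashes, seed
        self.bits = [False] * size

    def _positions(self, item: str):
        # sha256 is a stand-in here, not the hash Bro uses.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{self.seed}:{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

    def merge(self, other: "BloomFilter") -> None:
        # Bits set under one seed are meaningless under another, so refuse.
        if (self.size, self.num_hashes, self.seed) != (
            other.size, other.num_hashes, other.seed
        ):
            raise ValueError("incompatible hashers")
        self.bits = [a or b for a, b in zip(self.bits, other.bits)]
```

With a shared seed on every node, a filter filled on a worker can be merged into one on the manager and membership tests still work.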

Robin

--
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] can I send an opaque of bloomfilter over Cluster::manager2worker_event ?

2017-05-01 Thread Robin Sommer

On Fri, Apr 28, 2017 at 17:55 -0700, you wrote:

> 1493427133.170419   Reporter::ERROR incompatible hashers in 
> BasicBloomFilter merge  (empty) -

> Not sure if the error is because an opaque of bloomfilter cannot be
> sent over worker2manager_events and manager2worker_events or if I am
> doing something not quite right.

Bloom filters should work over communication. What's the code for the
two sides? The error messages sound like these are filters of
different types.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] early performance comparisons of CAF-based run loop

2017-04-20 Thread Robin Sommer


On Fri, Apr 14, 2017 at 17:32 +, you wrote:

> Just mentioning it in case you didn’t account for the real fix also
> requiring the CAF-based loop being fully realized in addition to
> Broker

Yeah, true, I was thinking that eventually we will have this all solved.

>  (Also don’t have a sense of the frequency/urgency of the problem).

I think that's the main question. So far I haven't gotten the sense
that this really affects a lot of people, so I see the priority as
rather low given our limited cycles for development and testing. If
it's a more pressing problem, we can reconsider of course.

> So since I’ve been able to get the CAF-based loop working on offline
> pcap files (it does not rely on polling the FD of the open file since
> it didn't work anyway w/ CAF's epoll-based multiplexer on Linux), it
> may be fair to say that other packet sources that don’t
> require/support poll-ability should also be possible to integrate.

I need to think about that argument ... Did you try reading from files
while also doing communication (that would be pseudo-realtime mode),
or was the pcap the only source of input?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] early performance comparisons of CAF-based run loop

2017-04-19 Thread Robin Sommer


On Fri, Apr 14, 2017 at 07:56 -0700, you wrote:

> Just a quick comment here regarding FreeBSD: the native polling
> mechanism is kqueue, and CAF still lacks support for it [1].
> Fortunately, this is a rather straight-forward task.

Oh, sounds like that would be a high-priority task then before we'd
consider moving to a CAF-based loop?

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] early performance comparisons of CAF-based run loop

2017-04-14 Thread Robin Sommer
Nice, thanks for doing these measurements! I haven't looked at the
code yet, but here are some quick thoughts on your results and some of the
other comments in this thread, followed by some suggested next steps at the
end.

- Agree that overall your numbers suggest that all these mechanisms
  are fine performancewise, assuming we keep the optimization to batch
  packets between polls/selects to avoid the
  one-system-call-per-packet overhead.

- I don't think we should spend time anymore on improving the old
  communication code. We're getting close to retiring that now and a
  number of its issues (like selects in the child process) will just
  go away with that. Let's focus on the new setting where Broker/CAF
  will be doing all communication.

- Regarding optimizing for different use cases: I would prefer
  avoiding having lots of knobs to configure the specifics of the
  loop. We have these magic values in the current I/O loop where
  nobody knows how to pick them because it's hard to understand their
  impact; and where folks have played with them, it was always hard
  to conclude much about them beyond any specific setting. What we could
  try instead is a loop that adjusts itself based on load patterns: if
  the load is heavy on packets, build larger batches to process
  between polls; if input comes from lots of different sources, increase
  the polling; etc. Any heuristic here would need to stay quite simple
  (otherwise we'd again end up not being able to predict much), but I
  think that'd be worth a try.

- Gilbert's point on high-performance IPC is a good one. I don't think
  we want to switch to direct memory access as our main model for the
  time being at least, but it does pose the question if/how we can
  integrate packet sources that either don't need or don't support
  select/poll. (Which, in a nod to history, accounts for some of the
  complexities of the current loop because many years ago some pcaps
  didn't support select.)
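
To give the self-adjusting loop idea some shape, here is one possible heuristic as a Python sketch. It is purely illustrative: the function name, its inputs, and the doubling/halving rule are assumptions, not something measured or proposed in this thread.

```python
def next_batch_size(current: int, packets_seen: int, other_sources_ready: int,
                    lo: int = 1, hi: int = 1024) -> int:
    """Pick the packet batch size for the next loop round.

    Hypothetical heuristic: grow the batch when the last batch was filled
    and nothing else was waiting; shrink it when other sources had input.
    """
    if other_sources_ready > 0:
        current //= 2  # other inputs are waiting: poll more often
    elif packets_seen >= current:
        current *= 2   # packet-heavy load: amortize the poll over more packets
    # Clamp so the loop never stops polling or polls per packet forever.
    return max(lo, min(hi, current))
```

The rule has no tunable beyond the two bounds, which keeps its behaviour predictable, in line with the point above about avoiding knobs nobody knows how to set.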


In terms of next steps, we need to see if these results hold across
different OSs, and also with live traffic. The two questions are (1)
does the new loop function on all platforms with both low- and
high-volume live traffic (presumably it will but that needs double
checking, given the history of weird OS-specific effects); and (2)
does performance match the measurements shown so far? If we can
confirm that on at least Linux and FreeBSD for, say, the two most
recent major releases of each and also consider common alternative
capturing solutions (pfring, netmap, afnet?), I'd be pretty
comfortable switching.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] how to compile bro plugin at the same time as bro

2017-04-10 Thread Robin Sommer


On Fri, Apr 07, 2017 at 11:11 +0100, you wrote:

> I would like to ask how to enable compilation and installation  of bro's
> plugin at the same time as the rest of bro is compiling/installing.

That's a good question, we don't have a mechanism for that yet.
Currently the assumption is that dynamic plugins are compiled
separately. It would indeed be nice to have the Bro CMake
configuration pick up further dynamic plugins to compile along.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [desired broker api as oppose to whats in known-hosts.bro]

2017-03-06 Thread Robin Sommer


On Mon, Mar 06, 2017 at 13:24 -0500, you wrote:

> I'm playing with rewriting some of the scripts for 2.6 right now with
> events because I think that the broker api is too complicated and
> could potentially have too many side effects. (like overuse of "when"
> for instance).

Yeah, that makes sense. We can switch back once we have something
better.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] [desired broker api as oppose to whats in known-hosts.bro]

2017-03-06 Thread Robin Sommer


On Sat, Mar 04, 2017 at 00:21 +, you wrote:

> Broker::exists(Cluster::cluster_store, Broker::data("known_hosts"))
> Broker::lookup(Cluster::cluster_store, Broker::data("known_hosts"))
> Broker::set_contains(res2$result, Broker::data(host))
> Broker::add_to_set(Cluster::cluster_store, Broker::data("known_hosts"), 
> Broker::data(host));

The first step will be getting this down to (some version of) this
form instead:

Broker::exists(Cluster::cluster_store, "known_hosts")
result = Broker::lookup(Cluster::cluster_store, "known_hosts")
Broker::set_contains(result, host)
Broker::add_to_set(Cluster::cluster_store, "known_hosts", host);

This should be relatively straight-forward to achieve once we have the
changes in that Matthias is currently working on.
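
The difference between the two forms is only where the wrapping happens. A Python mock makes that visible; it is not the Broker API, and Data, Store and set_contains are hypothetical stand-ins for Broker::data and the store calls quoted above.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Data:
    """Explicit wrapper, playing the role of Broker::data(...) in the quoted code."""
    value: Any


def to_data(x: Any) -> Data:
    # Wrap native values automatically; pass already-wrapped ones through.
    return x if isinstance(x, Data) else Data(x)


class Store:
    """In-memory stand-in for a key-value store holding sets."""

    def __init__(self):
        self._kv: Dict[Data, set] = {}

    def exists(self, key) -> bool:
        return to_data(key) in self._kv

    def lookup(self, key) -> set:
        return self._kv.get(to_data(key), set())

    def add_to_set(self, key, element) -> None:
        self._kv.setdefault(to_data(key), set()).add(to_data(element))


def set_contains(s: set, element) -> bool:
    return to_data(element) in s
```

Callers can pass plain strings or explicitly wrapped values and get the same result, so existing call sites would keep working while new ones drop the wrapper.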

And hopefully we can then also get rid of the "when" statement for
using these.

As a reminder, this is the current proposal for next Broker steps:

https://www.bro.org/development/projects/broker-lang-ext.html

Whether we want to go further, like with Aashish's proposal, we
can see then, I'm still a bit undecided there. Either way, a better
Bro-side UI for Broker is coming, we just need to get the various
pieces in place first that people are working on currently before we
can move forward.

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


[Bro-Dev] Going to delete old branches

2017-02-23 Thread Robin Sommer
%% bro/master/aux/broctl/aux/pysubnettree/cmake
topic/dnthayer/ticket1516
topic/dnthayer/ticket1733
topic/dnthayer/ticket1734
topic/jsiwek/bif-loader-scripts
topic/jsiwek/broker
topic/jsiwek/homebrew-openssl
topic/jsiwek/jemalloc
topic/robin/dynamic-plugins-2.3
topic/robin/pktsrc
topic/robin/plugin-updates
topic/robin/reader-writer-plugins
topic/vladg/homebrew-openssl
%% bro/master/aux/broctl/aux/trace-summary
topic/dnthayer/ticket1297
topic/dnthayer/ticket1304
topic/dnthayer/ticket1571
topic/dnthayer/ticket1724
topic/dnthayer/ticket1730
topic/dnthayer/ticket1749
topic/dnthayer/ticket856
%% bro/master/aux/broctl/aux/trace-summary/cmake
topic/dnthayer/ticket1516
topic/dnthayer/ticket1733
topic/dnthayer/ticket1734
topic/jsiwek/bif-loader-scripts
topic/jsiwek/broker
topic/jsiwek/homebrew-openssl
topic/jsiwek/jemalloc
topic/robin/dynamic-plugins-2.3
topic/robin/pktsrc
topic/robin/plugin-updates
topic/robin/reader-writer-plugins
topic/vladg/homebrew-openssl
%% bro/master/aux/broctl/cmake
topic/dnthayer/ticket1516
topic/dnthayer/ticket1733
topic/dnthayer/ticket1734
topic/jsiwek/bif-loader-scripts
topic/jsiwek/broker
topic/jsiwek/homebrew-openssl
topic/jsiwek/jemalloc
topic/robin/dynamic-plugins-2.3
topic/robin/pktsrc
topic/robin/plugin-updates
topic/robin/reader-writer-plugins
topic/vladg/homebrew-openssl
%% bro/master/aux/broker
topic/python3-fix
%% bro/master/aux/broker/cmake
topic/dnthayer/ticket1516
topic/dnthayer/ticket1733
topic/dnthayer/ticket1734
topic/jsiwek/bif-loader-scripts
topic/jsiwek/broker
topic/jsiwek/homebrew-openssl
topic/jsiwek/jemalloc
topic/robin/dynamic-plugins-2.3
topic/robin/pktsrc
topic/robin/plugin-updates
topic/robin/reader-writer-plugins
topic/vladg/homebrew-openssl
%% bro/master/aux/btest
topic/dnthayer/max-lines
topic/dnthayer/mktemp
topic/dnthayer/py3-compat
topic/dnthayer/ticket1322
topic/dnthayer/ticket1722
topic/dnthayer/ticket1750
topic/dnthayer/ticket862
topic/robin/timing
%% bro/master/aux/netcontrol-connectors
%% bro/master/aux/plugins
topic/dnthayer/doc-improvements-2.4
topic/dnthayer/fix-typos
topic/dnthayer/ticket1536
topic/johanna/postgres
topic/robin/netmap
topic/robin/plugin-updates
topic/robin/rework-packets-merge
topic/robin/tcprs-merge-again
topic/vladg/es-fixes
%% bro/master/cmake
topic/dnthayer/ticket1516
topic/dnthayer/ticket1733
topic/dnthayer/ticket1734
topic/jsiwek/bif-loader-scripts
topic/jsiwek/broker
topic/jsiwek/homebrew-openssl
topic/jsiwek/jemalloc
topic/robin/dynamic-plugins-2.3
topic/robin/pktsrc
topic/robin/plugin-updates
topic/robin/reader-writer-plugins
topic/vladg/homebrew-openssl
%% bro/master/src/3rdparty
topic/jsiwek/file-signatures
topic/jsiwek/libmagic-integration
topic/jsiwek/new-libmagic

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Splitting up init-bare?

2017-02-12 Thread Robin Sommer


On Sun, Feb 12, 2017 at 00:20 -0500, you wrote:

> Would that work?  I know that internal and external plugins have some
> differences, but I don't know if that means we're limited in a bit in
> how we handle script land required data structures for analyzers.

For init-bare-style initialization code I was thinking the same, and
that's also partially where my __bare__.bro idea came from (actually
__init__.bro would be nicer I'm thinking now). I went back to the
plugin structure to see if we have the right mechanism there already,
but they work slightly differently in terms of when required data
structures get initialized. But we could make those __init__.bro
scripts work in either case I think.

For a more general reorganization of moving scripts and code together,
I'm still torn on that. I like that in theory, but haven't convinced
myself yet that I'd like it in practice. :)

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin


Re: [Bro-Dev] Splitting up init-bare?

2017-02-10 Thread Robin Sommer


On Fri, Feb 10, 2017 at 11:51 -0600, you wrote:

> What do people think about splitting up portions of init-bare into separate
> files

Yeah, I can see that. It would be nice, though, if init-bare.bro
wouldn't need lots of @load statements then to refer to the individual
files. Maybe we could add some automatic way instead, like calling the
files __bare__.bro and have Bro find them automatically (but to
Johanna's point, not sure if that would cause load-order problems).
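
One way to keep automatic discovery from causing load-order surprises is to make the order deterministic. The Python sketch below shows the idea only; it is not how Bro locates scripts, and find_init_scripts is a hypothetical helper. It walks a directory tree, visits parents before children and siblings in sorted order, and returns every file with the agreed name.

```python
import os
from typing import List


def find_init_scripts(root: str, name: str = "__bare__.bro") -> List[str]:
    """Return every file called `name` below `root`, in a stable order."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk descend in a deterministic order.
        dirnames.sort()
        if name in filenames:
            found.append(os.path.join(dirpath, name))
    return found
```

A stable order does not resolve real dependencies between the files, but it does make any ordering problem reproducible across machines.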

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] Improving Bro's main loop

2017-02-10 Thread Robin Sommer


On Fri, Feb 10, 2017 at 01:18 +, you wrote:

> if an IOSource needs to poll FDs it can just use poll() in its own
> actor/thread for now

Yeah, one basic decision we'll have to make is how much logic to move
into threads. Conceptually, that's the right thing to do, but we need
to make sure the code is thread-safe, and it generally makes
development and debugging harder in the future. CAF helps with all of
that, but all the legacy code worries me in that regard. That said,
the IOSources are pretty much self-contained and probably not very
problematic in that way. (But then: having some code needing to be
thread-safe, while other parts break every rule in the book in that
regard, is also confusing; we have that challenge already with the
logging and input frameworks.)
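
The "poll in its own thread" pattern can be sketched in Python as follows. It is an analogy, not Bro or CAF code: poll_source and run_once are hypothetical names, a pipe stands in for a packet source, and the sketch assumes a POSIX system where pipes can be polled. The only shared state is a thread-safe queue, which is what keeps the source self-contained.

```python
import os
import queue
import selectors
import threading


def poll_source(fd: int, out: queue.Queue) -> None:
    """Runs in its own thread: wait on the FD and hand data to the main loop."""
    sel = selectors.DefaultSelector()
    sel.register(fd, selectors.EVENT_READ)
    try:
        while True:
            sel.select()            # blocks only this thread
            data = os.read(fd, 4096)
            if not data:            # writer closed: end of input
                break
            out.put(data)           # the queue is the thread-safe handoff
    finally:
        sel.close()
        out.put(b"")                # sentinel: this source is finished


def run_once(payload: bytes) -> bytes:
    """Feed one payload through the pipe and collect it in the main thread."""
    r, w = os.pipe()
    q: queue.Queue = queue.Queue()
    t = threading.Thread(target=poll_source, args=(r, q))
    t.start()
    os.write(w, payload)
    os.close(w)
    chunks = []
    while (chunk := q.get()) != b"":
        chunks.append(chunk)
    t.join()
    os.close(r)
    return b"".join(chunks)
```

The main loop never touches the file descriptor while the thread is running, so none of the legacy code it calls has to become thread-safe.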

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin

