Re: DISCUSS: Quick change to parser config

Otto Fowler Mon, 04 Dec 2017 10:40:53 -0800

I’m not sure about consensus. I would like to see it summarized.

My point about assignment has to do with how many assignment-like operators
we are going to support.  Whether the assignment is to a temporary variable
or not may not need to be part of the grammar/language, since all variable
management is external in Stellar.




On December 4, 2017 at 13:14:23, Simon Elliston Ball (
si...@simonellistonball.com) wrote:

Personally I suspect that temporary variables are a different thing, as is the
assignment PR. They might be useful for intermediate steps in a parser, but then
we’re potentially getting more complex than a parser wants to be. I am
warming to the idea of temporary variables though.

In terms of the removal, I like the idea of the COMPLETE transformation to
express a projection. That makes the output interface of the Metron object
more explicit in a parser, which makes governance much easier.

Do we think this is a good consensus? Shall I ticket it (I might even code
it!) in the transformation form proposed?

Simon

On 4 Dec 2017, at 17:21, Casey Stella <ceste...@gmail.com> wrote:

So, just chiming in here.  It seems to me that we have a problem with
extraneous fields in a couple of different ways:

* Temporary Variables

I think that the problem of temporary variables is one beyond just the
parser.  What I'd like to see is the Stellar field transformations operate
similarly to the enrichment field transformations, in that they are no longer
a map (this is useful beyond this case for having multiple assignments for
a variable) and have a special assignment indicator which would indicate
a temporary variable (e.g. ^= instead of :=).  This would clean up some of
the use cases in enrichments as well.  Combine this with the assumption that
all non-temporary fields are included in the output of the field
transformation if it is not specified, and I think we have something that is
sensible and somewhat backwards compatible.  To wit:
{
 "fieldTransformations": [
   {
     "transformation": "STELLAR",
     "config": [
       "ipSrc ^= TRIM(raw_ip_src)",
       "ip_src_addr := ipSrc"
     ]
   }
 ]
}
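
To make the proposed semantics concrete, here is a minimal Python sketch of an
ordered list of assignments in which ^= binds a temporary and := binds an
output field. It is an illustration only, not Metron code: evaluate() is a
hypothetical stand-in for Stellar that understands just TRIM(field) and bare
field references, and apply_assignments() is an invented name.

```python
import re

# Hypothetical stand-in for Stellar evaluation: supports only TRIM(field)
# and bare field references, which is all the example config needs.
def evaluate(expr, scope):
    match = re.fullmatch(r"TRIM\((\w+)\)", expr)
    if match:
        return scope[match.group(1)].strip()
    return scope[expr]

def apply_assignments(message, statements):
    """Apply assignments in order; '^=' binds a temporary, ':=' an output field."""
    scope = dict(message)   # working copy holding both fields and temporaries
    temporaries = set()
    for statement in statements:
        name, op, expr = re.fullmatch(
            r"(\w+)\s*(\^=|:=)\s*(.+)", statement
        ).groups()
        scope[name] = evaluate(expr.strip(), scope)
        if op == "^=":
            temporaries.add(name)
        else:
            temporaries.discard(name)
    # Temporaries are dropped; every other field is kept by default.
    return {k: v for k, v in scope.items() if k not in temporaries}

result = apply_assignments(
    {"raw_ip_src": " 10.0.0.1 "},
    ["ipSrc ^= TRIM(raw_ip_src)", "ip_src_addr := ipSrc"],
)
```

Here result keeps raw_ip_src and ip_src_addr, while the temporary ipSrc never
reaches the output. Using a list rather than a map is what allows the second
statement to depend on the first.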

* Extraneous Fields from the Parser

For these, we do currently have a REMOVE field transformation, but I'd be
ok with a PROJECT or COMPLETE field transformation to provide a whitelist.
That might look like:
{
 "fieldTransformations": [
   {
     "transformation": "STELLAR",
     "config": [
       "ipSrc ^= TRIM(raw_ip_src)",
       "ip_src_addr := ipSrc"
     ]
   },
   {
     "transformation": "COMPLETE",
     "output" : [ "ip_src_addr", "ip_dst_addr", "message"]
   }
 ]
}

I think having these two treated separately makes sense because sometimes
you will want COMPLETE and sometimes not.  Also, this fits within the core
abstraction that we already have.
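
As a rough illustration of the whitelist idea, a COMPLETE (or PROJECT) step
could be sketched in Python as a plain projection over the message. The
function name complete() is hypothetical, and skipping whitelisted fields that
are absent from the message (rather than emitting nulls) is an assumption, not
something settled in this thread.

```python
def complete(message, output):
    """Keep only whitelisted fields; whitelisted fields missing from the message are skipped."""
    return {field: message[field] for field in output if field in message}

# A message as it might look after an earlier STELLAR transformation has run.
message = {
    "raw_ip_src": " 10.0.0.1 ",
    "ip_src_addr": "10.0.0.1",
    "message": "raw log line",
}

projected = complete(message, ["ip_src_addr", "ip_dst_addr", "message"])
```

Because the projection is its own step, a parser that wants every field to
pass through simply leaves it out of the chain.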

On Thu, Nov 30, 2017 at 8:21 PM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

Hmmm… Actually, I kinda like that.

May want a little refactoring in the back for clarity.

My question about whether we could ever imagine this ‘cleanup policy’
applying to other transforms would sway me to the field rather than
transformation name approach though.

Simon

On 1 Dec 2017, at 01:17, Otto Fowler <ottobackwa...@gmail.com> wrote:

Or we can create a new transformation type, STELLAR_COMPLETE, which may be
more in line with the original design.



On November 30, 2017 at 20:14:46, Otto Fowler (ottobackwa...@gmail.com) wrote:


I would suggest that instead of explicitly having “complete”, we have
"operation": "complete", so that we can have multiple transformations, each
with a different “operation”.

No operation would be the status quo ante, if we can do it so that we don’t
get errors with old configs and keep the same behavior.


{
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "operation": "complete",
      "output": ["ip_src_addr", "ip_dst_addr"],
      "config": {
        "ip_src_addr": "ipSrc",
        "ip_dest_addr": "ipDst"
      }
    },
    {
      "transformation": "STELLAR",
      "operation": "SomeOtherThing",
      "output": ["foo", "bar"],
      "config": {
        "foo": "TO_UPPER(foo)",
        "bar": "TO_LOWER(bar)"
      }
    }
  ]
}


Sorry for the junk examples, but hopefully it makes sense.
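
One way to picture the backwards-compatibility requirement is a small Python
sketch in which a missing "operation" falls through to today's behaviour. This
is an assumption-laden illustration, not Metron code: apply_transformation()
is an invented name, config values are treated as plain source field names
instead of being evaluated as Stellar, and rejecting unknown operations is a
design choice made only for the sketch.

```python
def apply_transformation(message, transformation):
    """Apply one transformation entry; a missing 'operation' keeps the current behaviour."""
    result = dict(message)
    for target, source in transformation.get("config", {}).items():
        # Stand-in for evaluating a Stellar expression: copy the source field.
        result[target] = message[source]

    operation = transformation.get("operation")
    if operation is None:
        # Status quo: new fields are added and nothing is removed.
        return result
    if operation == "complete":
        # Keep only the fields named in 'output'.
        return {k: v for k, v in result.items() if k in transformation["output"]}
    raise ValueError(f"unknown operation: {operation}")

message = {"ipSrc": "10.0.0.1", "ipDst": "10.0.0.2", "pointlessExtraStuff": "x"}
legacy = {"transformation": "STELLAR", "config": {"ip_src_addr": "ipSrc"}}
with_complete = dict(legacy, operation="complete", output=["ip_src_addr"])

legacy_result = apply_transformation(message, legacy)
complete_result = apply_transformation(message, with_complete)
```

An old config with no "operation" key yields the same superset of fields it
does today, while the same config with "operation": "complete" yields only
ip_src_addr.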





On November 30, 2017 at 20:00:06, Simon Elliston Ball (
si...@simonellistonball.com) wrote:


I’m looking at the way parser config works, and the transformation of fields
from their native names in, for example, the ASA or CEF parsers, into a
standard data model.

At the moment I would do something like this: assuming I have fields
[ipSrc, ipDst, pointlessExtraStuff, message], I might have:


{
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "output": ["ip_src_addr", "ip_dst_addr", "message"],
      "config": {
        "ip_src_addr": "ipSrc",
        "ip_dest_addr": "ipDst"
      }
    }
  ]
}

which leaves me with the field set:
[ipSrc, ipDst, pointlessExtraStuff, message, ip_src_addr, ip_dest_addr]

unless I go with:-

{
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "output": ["ip_src_addr", "ip_dst_addr", "message"],
      "config": {
        "ip_src_addr": "ipSrc",
        "ip_dest_addr": "ipDst",
        "pointlessExtraStuff": null,
        "ipSrc": null,
        "ipDst": null
      }
    }
  ]
}

which seems a little over-verbose.

Do you think it would be valuable to add a switch of some sort on the
transformation to make it “complete”, i.e. to only preserve fields which
are explicitly set?

To my mind, this breaks a principle of mutability, but gives us a much,
much cleaner mapping of data.


I would propose something like:

{
  "fieldTransformations": [
    {
      "transformation": "STELLAR",
      "complete": true,
      "output": ["ip_src_addr", "ip_dst_addr", "message"],
      "config": {
        "ip_src_addr": "ipSrc",
        "ip_dest_addr": "ipDst"
      }
    }
  ]
}

which would give me the set ["ip_src_addr", "ip_dst_addr", "message"],
effectively making the nulling in my previous example implicit.
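
The claimed equivalence between explicit nulling and a "complete" flag can be
checked with a small Python sketch. It is a toy model under stated
assumptions: stellar_transform() is a hypothetical name, config values are
either None (remove the field) or a source field name, and the sketch spells
the destination key ip_dst_addr in both "config" and "output", whereas the
configs quoted in this message use ip_dest_addr in "config".

```python
def stellar_transform(message, config, output=None, complete=False):
    """Toy model: a None value removes a field, any other value copies a source field."""
    result = dict(message)
    for target, source in config.items():
        if source is None:
            result.pop(target, None)
        else:
            result[target] = message[source]   # read from the untouched input
    if complete:
        # Implicit nulling: drop everything not named in 'output'.
        result = {k: v for k, v in result.items() if k in output}
    return result

message = {
    "ipSrc": "10.0.0.1",
    "ipDst": "10.0.0.2",
    "pointlessExtraStuff": "x",
    "message": "raw log line",
}
mapping = {"ip_src_addr": "ipSrc", "ip_dst_addr": "ipDst"}
output = ["ip_src_addr", "ip_dst_addr", "message"]

# Verbose form: null out every unwanted field by hand.
explicit = stellar_transform(
    message,
    dict(mapping, pointlessExtraStuff=None, ipSrc=None, ipDst=None),
)
# Proposed form: let the flag do the nulling.
implicit = stellar_transform(message, mapping, output=output, complete=True)
```

Both calls leave exactly ip_src_addr, ip_dst_addr and message. The verbose
form has to name every unwanted input field, so it grows with the parser's
raw output, while the flagged form does not.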


Thoughts?

Also, in the second scenario, if 'output' were to be empty, would we
assume that the output field set should be ["ip_src_addr", "ip_dst_addr"]?


Simon
