Re: [DISCUSSION] METRON-1046 -> Stellar Files for multiple statement execution
Yes, if the files are also stored in ZK, Curator can watch them, but it would require extension work in our Curator usage. It currently manages a single tree cache. Managing free-floating files would require careful design work.

Casey, my reference to JSON Pointers was itself the result of a 5-minute search; I inferred they might exist and searched for them :-) But they should at least be looked into before we roll our own, especially if they do happen to work with Curator. The initial pointers are:

Structuring a complex schema - Understanding JSON Schema 1.0
https://spacetelescope.github.io/understanding-json-schema/structuring.html
Likewise in JSON Schema, for anything but the most trivial schema, it's really useful ... $ref can also be a relative or absolute URI, so if you prefer to include your ...

RapidJSON: Pointer
http://rapidjson.org/md_doc_pointer.html
(This feature was released in v1.1.0). JSON Pointer is a standardized (RFC6901) way to select a value inside a JSON Document (DOM). This can be analogous ...

RFC 6901 - JavaScript Object Notation (JSON) Pointer - IETF Tools
https://tools.ietf.org/html/rfc6901
by M Nottingham - 2013
Abstract JSON Pointer defines a string syntax for identifying a specific value within a JavaScript Object Notation (JSON) document. Status of This Memo This is ...
JSON API - Latest Specification (v1.0)
http://jsonapi.org/format/
This page represents the latest published version of JSON API, which is currently 1.0 ... This section describes the structure of a JSON API document, which is ...
[Note: this references JSON Pointer as a standard entity [RFC6901] but not as part of the JSON spec.]

From: Otto Fowler <ottobackwa...@gmail.com>
Date: Friday, July 14, 2017 at 10:42 AM
To: Matt Foley <mfo...@hortonworks.com>, "dev@metron.apache.org" <dev@metron.apache.org>
Subject: Re: [DISCUSSION] METRON-1046 -> Stellar Files for multiple statement execution

I think the ‘files’ should be stored in zk, and updated with the same mechanism.

On July 14, 2017 at 10:34:36, Casey Stella (ceste...@gmail.com) wrote:

Just chiming in on a part of this: definitely we do not want to lose automatic config updates (at least, I'd be strongly, strongly STRONGLY against it). I definitely agree that JSON files could easily get unwieldy.

I don't know anything about JSON pointers, could you cover that briefly, Matt? Even a URL or two to get started would be great. Basic googling (while on vacation) yielded that it was something like xpath for json, but I probably just googled the wrong thing.

Casey

On July 14, 2017 at 13:27:36, Matt Foley (mfo...@hortonworks.com) wrote:

In the abstract, this is a good idea. I see it as related to METRON-987, which was the first step in allowing sequences of Stellar statements (aka "programs" :-) ) instead of just unrelated groups of single statements. Your proposal lets us really work with programs as first-class entities.

However, some concerns need to be resolved:

1. Syntax.

Currently Stellar syntax and JSON fit neatly together. Where would be the cut line for file substitutions? Referencing METRON-987, would you only allow a file substitution where we currently allow square-bracketed Stellar string sequences? What about Profile config syntax, where several chunks of code are intimately related (hence want to be located in the same file), but don't all get executed at the same time? (This is not a showstopper question because Profile configs are usually simple and don't really need file substitution. The need is much greater in Enrichment.)

2. Config Updates.

Currently Metron configuration is stored in ZK, but managed through Curator libraries. In return for considerable complexity, this gives instant update whenever a config changes, without effort in the BI part of the application. This differs sharply from file-based configuration, where updates in response to config changes require either a restart, an explicit reload command from the user, or frequent state-checking in the application.

So currently people trying to develop a new enrichment can update the config, and immediately test the result, without restarting and without any explicit reload command. We probably want to not lose this.

Rat
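
For anyone following the JSON Pointer links in this message: RFC 6901 defines a pointer as a sequence of '/'-prefixed reference tokens, where "~1" stands for "/" and "~0" stands for "~" inside a token. A minimal Python sketch of that lookup follows. The sample config fragment is made up for illustration, and nothing here is Metron or Curator code; whether Curator can work with such pointers is still the open question raised above.

```python
import re

# Array index syntax from RFC 6901: "0", or digits without a leading zero.
_ARRAY_INDEX = re.compile(r"0|[1-9][0-9]*")


def resolve_pointer(document, pointer):
    """Return the value an RFC 6901 JSON Pointer selects in a parsed document."""
    if pointer == "":
        return document  # the empty pointer selects the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" -> "/", then "~0" -> "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            if not _ARRAY_INDEX.fullmatch(token):
                raise KeyError(f"invalid array index: {token!r}")
            current = current[int(token)]
        elif isinstance(current, dict):
            current = current[token]
        else:
            raise KeyError(f"cannot descend into a scalar at {token!r}")
    return current


# Illustrative config fragment (an assumption made up for this example).
sample_config = {
    "fieldTransformations": [
        {"transformation": "STELLAR", "config": {"full_hostname": "URL_TO_HOST(url)"}}
    ],
    "a/b": 1,
    "m~n": 2,
}

print(resolve_pointer(sample_config, "/fieldTransformations/0/config/full_hostname"))
print(resolve_pointer(sample_config, "/a~1b"))
print(resolve_pointer(sample_config, "/m~0n"))
```

A pointer only selects a value inside one document. Pointing from a config at a separate 'file' would additionally need a URI part in front of the pointer, which is what the JSON Schema $ref link describes.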
Re: [DISCUSSION] METRON-1046 -> Stellar Files for multiple statement execution
A couple things I would like to point out.

You can test Stellar statements without having to send data through parser/enrichment topologies. There is a REST endpoint that allows you to pass in a sample message and parser config and returns a message with Stellar statements applied. This could easily be expanded to enrichment configs or testing generic stellar statements against test messages.

Moving statements to a separate file is going to require a lot of work and will make our mechanism for managing configuration in bolts more complex. We would have to also listen for changes in these files and reconcile which parser/enrichment configs are affected.

On Fri, Jul 14, 2017 at 12:42 PM, Otto Fowler wrote:

> I think the ‘files’ should be stored in zk, and updated with the same
> mechanism.
>
> On July 14, 2017 at 13:27:36, Matt Foley (mfo...@hortonworks.com) wrote:
>
> In the abstract, this is a good idea. I see it as related to METRON-987,
> which was the first step in allowing sequences of Stellar statements (aka
> "programs" :-) ) instead of just unrelated groups of single statements.
> Your proposal lets us really work with programs as first-class entities.
>
> However, some concerns need to be resolved:
>
> 1. Syntax.
>
> Currently Stellar syntax and JSON fit neatly together. Where would be the
> cut line for file substitutions? Referencing METRON-987, would you only
> allow a file substitution where we currently allow square-bracketed Stellar
> string sequences? What about Profile config syntax, where several chunks of
> code are intimately related (hence want to be located in the same file),
> but don't all get executed at the same time? (This is not a showstopper
> question because Profile configs are usually simple and don't really need
> file substitution. The need is much greater in Enrichment.)
>
> 2. Config Updates.
>
> Currently Metron configuration is stored in ZK, but managed through Curator
> libraries. In return for considerable complexity, this gives instant update
> whenever a config changes, without effort in the BI part of the
> application. This differs sharply from file-based configuration, where
> updates in response to config changes require either a restart, an explicit
> reload command from the user, or frequent state-checking in the
> application.
>
> So currently people trying to develop a new enrichment can update the
> config, and immediately test the result, without restarting and without any
> explicit reload command. We probably want to not lose this.
>
> Rather than roll our own file pointer model, can we use JSON Pointers? Will
> they work with Curator? Both of those get into some fairly obscure
> features, that would need to be studied. It also actually relates to the
> syntax question presented above.
>
>
> On 7/14/17, 6:17 AM, "Otto Fowler" wrote:
>
> https://issues.apache.org/jira/browse/METRON-1046
>
> I was thinking this morning that managing stellar statements in the config
> json could become, and maybe is kind of unwieldy.
> To that end, if in say a parser configuration I can refer to a ‘file’ in
> zookeeper as an alternative, we would add the capability to execute and
> manage more complex statements, and even chain multiple statements
> together.
>
> These files could be shared as well.
>
> This could be a Bad Idea™, so I thought I’d throw it out to the list.
>
> Please check out the jira, give some thought, and comment there or on the
> list or both.
>
> O
>
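
To make the reconciliation concern at the top of this message concrete, here is a rough Python sketch of the bookkeeping it implies: a reverse index from each referenced file to the configs that use it, so that a change notification for one file can be mapped to the affected parser/enrichment configs. The "zkfile:" reference marker, the paths, and the config shapes are all hypothetical; no such syntax exists in Metron, and the actual watching would be done through Curator rather than in code like this.

```python
from collections import defaultdict

# Hypothetical marker for "this value lives in a separate file"; not Metron syntax.
FILE_PREFIX = "zkfile:"


def referenced_files(node):
    """Yield every file path referenced anywhere inside a parsed config."""
    if isinstance(node, dict):
        for value in node.values():
            yield from referenced_files(value)
    elif isinstance(node, list):
        for item in node:
            yield from referenced_files(item)
    elif isinstance(node, str) and node.startswith(FILE_PREFIX):
        yield node[len(FILE_PREFIX):]


def build_reverse_index(configs):
    """Map each referenced file path to the set of config names that use it."""
    index = defaultdict(set)
    for name, config in configs.items():
        for path in referenced_files(config):
            index[path].add(name)
    return index


def affected_configs(index, changed_path):
    """Return the names of the configs to reload when one file changes."""
    return sorted(index.get(changed_path, ()))


# Made-up configs: two share one file, one uses no file at all.
configs = {
    "bro": {"fieldMap": {"stellar": {"config": "zkfile:/stellar/common.stellar"}}},
    "snort": {"fieldMap": {"stellar": {"config": ["zkfile:/stellar/common.stellar",
                                                  "zkfile:/stellar/snort.stellar"]}}},
    "yaf": {"fieldMap": {"stellar": {"config": {"host": "URL_TO_HOST(url)"}}}},
}

index = build_reverse_index(configs)
print(affected_configs(index, "/stellar/common.stellar"))
```

The index itself would have to be rebuilt whenever a config changes, which is part of the added complexity described above.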
Re: [DISCUSSION] METRON-1046 -> Stellar Files for multiple statement execution
I think the ‘files’ should be stored in zk, and updated with the same mechanism.

On July 14, 2017 at 13:27:36, Matt Foley (mfo...@hortonworks.com) wrote:

In the abstract, this is a good idea. I see it as related to METRON-987, which was the first step in allowing sequences of Stellar statements (aka "programs" :-) ) instead of just unrelated groups of single statements. Your proposal lets us really work with programs as first-class entities.

However, some concerns need to be resolved:

1. Syntax.

Currently Stellar syntax and JSON fit neatly together. Where would be the cut line for file substitutions? Referencing METRON-987, would you only allow a file substitution where we currently allow square-bracketed Stellar string sequences? What about Profile config syntax, where several chunks of code are intimately related (hence want to be located in the same file), but don't all get executed at the same time? (This is not a showstopper question because Profile configs are usually simple and don't really need file substitution. The need is much greater in Enrichment.)

2. Config Updates.

Currently Metron configuration is stored in ZK, but managed through Curator libraries. In return for considerable complexity, this gives instant update whenever a config changes, without effort in the BI part of the application. This differs sharply from file-based configuration, where updates in response to config changes require either a restart, an explicit reload command from the user, or frequent state-checking in the application.

So currently people trying to develop a new enrichment can update the config, and immediately test the result, without restarting and without any explicit reload command. We probably want to not lose this.

Rather than roll our own file pointer model, can we use JSON Pointers? Will they work with Curator? Both of those get into some fairly obscure features, that would need to be studied. It also actually relates to the syntax question presented above.

On 7/14/17, 6:17 AM, "Otto Fowler" wrote:

https://issues.apache.org/jira/browse/METRON-1046

I was thinking this morning that managing stellar statements in the config json could become, and maybe is kind of unwieldy.
To that end, if in say a parser configuration I can refer to a ‘file’ in zookeeper as an alternative, we would add the capability to execute and manage more complex statements, and even chain multiple statements together.

These files could be shared as well.

This could be a Bad Idea™, so I thought I’d throw it out to the list.

Please check out the jira, give some thought, and comment there or on the list or both.

O
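
The proposal quoted above could be sketched roughly as follows, in Python for brevity. A config value that names a 'file' is replaced by the list of statements stored under that name, which is one possible answer to the "cut line" question: substitution is allowed wherever a list of Stellar statements is allowed. The "zkfile:" marker, the paths, the sample statements, and the plain dict standing in for ZooKeeper are all assumptions made for illustration and are not existing Metron behavior.

```python
# Hypothetical marker for a file reference; not an existing Metron convention.
FILE_PREFIX = "zkfile:"


def expand_file_refs(node, store):
    """Return a copy of a parsed config with file references replaced by
    the statement lists stored under the referenced paths."""
    if isinstance(node, dict):
        return {key: expand_file_refs(value, store) for key, value in node.items()}
    if isinstance(node, list):
        return [expand_file_refs(item, store) for item in node]
    if isinstance(node, str) and node.startswith(FILE_PREFIX):
        path = node[len(FILE_PREFIX):]
        if path not in store:
            raise KeyError(f"referenced file not found: {path}")
        return list(store[path])  # copy so callers cannot mutate the store
    return node


# Stand-in for files kept in ZooKeeper: path -> ordered Stellar statements.
store = {
    "/stellar/hostname.stellar": [
        "host := URL_TO_HOST(url)",
        "domain := DOMAIN_REMOVE_SUBDOMAINS(host)",
    ],
}

# Made-up enrichment-style config that refers to the shared file.
config = {"fieldMap": {"stellar": {"config": "zkfile:/stellar/hostname.stellar"}}}

expanded = expand_file_refs(config, store)
print(expanded["fieldMap"]["stellar"]["config"])
```

Because the same path can appear in several configs, the statements are shared, as suggested in the proposal; the original config is left untouched so it can be re-expanded when the file changes.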
Re: [DISCUSSION] METRON-1046 -> Stellar Files for multiple statement execution
Just chiming in on a part of this: definitely we do not want to lose automatic config updates (at least, I'd be strongly, strongly STRONGLY against it). I definitely agree that JSON files could easily get unwieldy.

I don't know anything about JSON pointers, could you cover that briefly, Matt? Even a URL or two to get started would be great. Basic googling (while on vacation) yielded that it was something like xpath for json, but I probably just googled the wrong thing.

Casey

On Fri, Jul 14, 2017 at 6:27 PM, Matt Foley wrote:

> In the abstract, this is a good idea. I see it as related to METRON-987,
> which was the first step in allowing sequences of Stellar statements (aka
> "programs" :-) ) instead of just unrelated groups of single statements.
> Your proposal lets us really work with programs as first-class entities.
>
> However, some concerns need to be resolved:
>
> 1. Syntax.
>
> Currently Stellar syntax and JSON fit neatly together. Where would be the
> cut line for file substitutions? Referencing METRON-987, would you only
> allow a file substitution where we currently allow square-bracketed Stellar
> string sequences? What about Profile config syntax, where several chunks
> of code are intimately related (hence want to be located in the same file),
> but don't all get executed at the same time? (This is not a showstopper
> question because Profile configs are usually simple and don't really need
> file substitution. The need is much greater in Enrichment.)
>
> 2. Config Updates.
>
> Currently Metron configuration is stored in ZK, but managed through
> Curator libraries. In return for considerable complexity, this gives
> instant update whenever a config changes, without effort in the BI part of
> the application. This differs sharply from file-based configuration, where
> updates in response to config changes require either a restart, an explicit
> reload command from the user, or frequent state-checking in the application.
>
> So currently people trying to develop a new enrichment can update the
> config, and immediately test the result, without restarting and without any
> explicit reload command. We probably want to not lose this.
>
> Rather than roll our own file pointer model, can we use JSON Pointers?
> Will they work with Curator? Both of those get into some fairly obscure
> features, that would need to be studied. It also actually relates to the
> syntax question presented above.
>
>
> On 7/14/17, 6:17 AM, "Otto Fowler" wrote:
>
> https://issues.apache.org/jira/browse/METRON-1046
>
> I was thinking this morning that managing stellar statements in the config
> json could become, and maybe is kind of unwieldy.
> To that end, if in say a parser configuration I can refer to a ‘file’ in
> zookeeper as an alternative, we would add the capability to execute and
> manage more complex statements, and even chain multiple statements
> together.
>
> These files could be shared as well.
>
> This could be a Bad Idea™, so I thought I’d throw it out to the list.
>
> Please check out the jira, give some thought, and comment there or on the
> list or both.
>
> O
>