Task 1 is complete - https://issues.apache.org/jira/browse/METRON-2071. This work also added one function beyond the original discussion, MAP_MERGE, for easier integration with the flatfile summarizer (and to support multi-threading there).

Tasks 2 and 3 can now be done independently of one another. I'm inclined to focus on https://issues.apache.org/jira/browse/METRON-2072 first in order to enable some additional in-memory enrichment use cases. Feedback welcome.
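To illustrate why a merge function helps with multi-threaded loading, here is a minimal Java sketch. It is not the Metron implementation: `MapMergeSketch`, `mergeMaps`, `loadChunk`, and `summarize` are hypothetical names, the "key,value" input format is invented for the example, and the rule that later maps win on key collisions is an assumption. The idea is that each worker fills its own private map, and the partial maps are combined only once all workers have finished.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MapMergeSketch {

    // Hypothetical stand-in for a MAP_MERGE-style function: copies the maps
    // into a new map in list order, so later maps win on key collisions.
    // The collision rule is an assumption, not taken from METRON-2071.
    static <K, V> Map<K, V> mergeMaps(List<Map<K, V>> maps) {
        Map<K, V> merged = new LinkedHashMap<>();
        for (Map<K, V> m : maps) {
            if (m != null) {
                merged.putAll(m);
            }
        }
        return merged;
    }

    // Each worker fills its own private map from one chunk of "key,value"
    // lines, so no map is shared between threads while loading.
    static Map<String, String> loadChunk(List<String> lines) {
        Map<String, String> partial = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                partial.put(parts[0].trim(), parts[1].trim());
            }
        }
        return partial;
    }

    // Loads every chunk on a small thread pool, then merges the partial maps
    // on the calling thread in chunk order.
    static Map<String, String> summarize(List<List<String>> chunks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Map<String, String>>> futures = new ArrayList<>();
            for (List<String> chunk : chunks) {
                futures.add(pool.submit(() -> loadChunk(chunk)));
            }
            List<Map<String, String>> partials = new ArrayList<>();
            for (Future<Map<String, String>> f : futures) {
                partials.add(f.get());
            }
            return mergeMaps(partials);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> chunks = List.of(
            List.of("10.0.0.1,internal", "10.0.0.2,internal"),
            List.of("10.0.0.2,dmz", "192.168.1.5,guest"));
        System.out.println(summarize(chunks));
    }
}
```

Because the merge runs in chunk order after the workers finish, the result does not depend on thread scheduling. Under the last-wins assumption, `10.0.0.2` ends up mapped to `dmz`.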
On Fri, Apr 12, 2019 at 11:01 AM Michael Miklavcic <michael.miklav...@gmail.com> wrote:

> Following up on this thread, I've created a few tickets around this
> functionality. I think this can be tackled in 3 parts:
>
> 1. Add a MAP_PUT function -
>    https://issues.apache.org/jira/browse/METRON-2071
> 2. Add syntactic sugar for map get/put to Stellar and Stellar shell -
>    https://issues.apache.org/jira/browse/METRON-2072
> 3. Create a use case that enables in-memory enrichment loading via
>    serialized maps from the flatfile summarizer -
>    https://issues.apache.org/jira/browse/METRON-2073
>
> I already started on #1, and aside from some documentation updates and a
> few more unit/integration tests, it's basically done.
>
> #3 might already handle caching and the ability to update enrichments with
> OBJECT_GET, but I haven't looked that deeply into it yet. There will also
> probably be some use for a MAP_MERGE, though as I'm thinking on the fly
> about this, we may be able to massage the REDUCE function to this purpose -
> https://github.com/apache/metron/tree/master/metron-stellar/stellar-common#reduce
>
> Best,
> Mike
>
> On Fri, Apr 5, 2019 at 2:04 PM Michael Miklavcic <michael.miklav...@gmail.com> wrote:
>
>> Hi all,
>>
>> We have a number of data structures and functions available via Stellar
>> for manipulating those structures. For example, we provide the following
>> functions for initializing and working with Bloom filters:
>>
>> [Stellar]>>> ?BLOOM_
>> ?BLOOM_ADD ?BLOOM_EXISTS ?BLOOM_INIT ?BLOOM_MERGE
>>
>> For Maps, we have some functions for dealing with existing maps, e.g.
>> MAP_GET, MAP_EXISTS. We even have a MapReduce type of function, MAP. And
>> from an initialization perspective, you can create a map in Stellar with
>> what amounts to an empty JSON object, e.g. mymap:={}. You can also turn
>> JSON into a proper map using TO_JSON_MAP.
>>
>> I don't, however, see a way to add to map object contents, e.g. PUT. So
>> the first question I have is whether this functionality (assuming I
>> haven't missed something obvious that enables this in the first place)
>> would be valuable to anyone, or if it's even something we should enable to
>> begin with. A vote against this option could be the potential for users
>> unwittingly loading up a ton of data and encountering OOM errors. This
>> would be bad for running topologies. The second question is whether this
>> should be a language feature or a specialized set of functions, e.g.
>>
>> MAP_INIT, MAP_PUT, MAP_REMOVE
>>
>> vs
>>
>> mymap:={}
>> mymap['mike']:='miklavcic'
>> mymap:=FILTER(mymap , (x) -> x != 'mike')
>>
>> *Note*, from the above, the only thing we appear to be missing is the
>> ability to add to an existing map. Initializing and removing elements can
>> be handled by existing functionality.
>>
>> Best,
>> Mike
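For readers less familiar with Stellar, the three operations from the original message (initialize, put, remove by filtering) can be sketched in plain Java. This is only an analogy for the proposed behavior, not Stellar or Metron code: `MapOpsSketch` and `filterKeys` are hypothetical names, the second map entry is made up for the example, and filtering on keys is an assumption about what the FILTER example intends.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

class MapOpsSketch {

    // Hypothetical analogue of removing entries by filtering: returns a new
    // map holding only the entries whose key passes the predicate. The
    // input map is left unchanged.
    static <K, V> Map<K, V> filterKeys(Map<K, V> map, Predicate<K> keep) {
        Map<K, V> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : map.entrySet()) {
            if (keep.test(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Roughly: mymap := {}
        Map<String, String> mymap = new LinkedHashMap<>();

        // Roughly: mymap['mike'] := 'miklavcic' (the missing "put" operation)
        mymap.put("mike", "miklavcic");
        mymap.put("other", "entry"); // made-up second entry for the example

        // Roughly: mymap := FILTER(mymap, (x) -> x != 'mike')
        mymap = filterKeys(mymap, k -> !k.equals("mike"));

        System.out.println(mymap);
    }
}
```

In this sketch the removal produces a fresh map and only the put mutates in place, which mirrors the point that adding to an existing map is the one operation not already covered.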