Re: [DISCUSS] Stellar MAP data type features

2019-04-16 Thread Michael Miklavcic
Task 1 is complete - https://issues.apache.org/jira/browse/METRON-2071.
This work also included one function beyond the original discussion,
MAP_MERGE, for easier integration with the flatfile summarizer (and to
support multi-threading). Tasks 2 and 3 can now be done independently of
one another. I'm inclined to focus on
https://issues.apache.org/jira/browse/METRON-2072 first in order to enable
some additional in-memory enrichment use cases. Feedback welcome.
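
To make the multi-threading point concrete, here is a minimal sketch in
Python (not Stellar) of the merge pattern: each worker builds a partial map
and the partials are merged at the end. The names map_merge and
summarize_chunk, the "key,value" input format, and the "last map wins"
collision rule are all assumptions for illustration, not the actual
summarizer code or the committed MAP_MERGE behavior.

```python
from concurrent.futures import ThreadPoolExecutor


def map_merge(maps):
    """Merge maps left to right; on key collisions the later map wins (assumed rule)."""
    merged = {}
    for m in maps:
        merged.update(m)
    return merged


def summarize_chunk(lines):
    """Hypothetical per-thread step: build a partial map from 'key,value' lines."""
    partial = {}
    for line in lines:
        key, value = line.split(",", 1)
        partial[key] = value
    return partial


# Two hypothetical chunks of a flat file, one per worker thread.
chunks = [["a,1", "b,2"], ["b,3", "c,4"]]

with ThreadPoolExecutor(max_workers=2) as pool:
    # pool.map returns results in input order, so the merge order is deterministic.
    partials = list(pool.map(summarize_chunk, chunks))

summary = map_merge(partials)
```

Because no thread writes to a shared map, no locking is needed; the only
ordering decision is which partial wins when keys overlap.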

On Fri, Apr 12, 2019 at 11:01 AM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Following up on this thread, I've created a few tickets around this
> functionality. I think this can be tackled in 3 parts:
>
>    1. Add a MAP_PUT function -
>    https://issues.apache.org/jira/browse/METRON-2071
>    2. Add syntactic sugar for map get/put to Stellar and Stellar shell -
>    https://issues.apache.org/jira/browse/METRON-2072
>    3. Create a use case that enables an in-memory enrichment loading via
>    serialized maps from the flatfile summarizer -
>    https://issues.apache.org/jira/browse/METRON-2073
>
> I already started on #1, and aside from some documentation updates and a
> few more unit/integration tests, it's basically done.
>
> #3 might already handle caching and the ability to update enrichments with
> OBJECT_GET, but I haven't looked that deeply into it yet. There will also
> probably be some use for a MAP_MERGE, though as I'm thinking on the fly
> about this, we may be able to massage the REDUCE function to this purpose -
> https://github.com/apache/metron/tree/master/metron-stellar/stellar-common#reduce
> .
>
> Best,
> Mike
>
>
> On Fri, Apr 5, 2019 at 2:04 PM Michael Miklavcic <
> michael.miklav...@gmail.com> wrote:
>
>> Hi all,
>>
>> We have a number of data structures and functions available via Stellar
>> for manipulating those structures. For example, we provide the following
>> functions for initializing and working with Bloom filters:
>> [Stellar]>>> ?BLOOM_
>> ?BLOOM_ADD  ?BLOOM_EXISTS  ?BLOOM_INIT  ?BLOOM_MERGE
>>
>> For Maps, we have some functions for dealing with existing maps, e.g.
>> MAP_GET, MAP_EXISTS. We even have a MapReduce type of function, MAP. And
>> from an initialization perspective, you can create a map in Stellar with
>> what amounts to an empty JSON object, e.g. mymap:={}. You can also turn
>> JSON into a proper map using TO_JSON_MAP.
>>
>> I don't, however, see a way to add to map object contents, e.g. PUT. So
>> the first question I have is whether this functionality would (assuming I
>> haven't missed something obvious that enables this in the first place) be
>> valuable to anyone, or if it's even something we should enable to begin
>> with. A vote against this option could be the potential for users
>> unwittingly loading up a ton of data and encountering OOM errors. This
>> would be bad for running topologies. The second question is whether this
>> should be a language feature or a specialized set of functions, e.g.
>> MAP_INIT, MAP_PUT, MAP_REMOVE
>> vs
>> mymap:={}
>> mymap['mike']:='miklavcic'
>> mymap:=FILTER(mymap , (x) -> x != 'mike')
>>
>> *Note*, from the above, the only thing we appear to be missing is the
>> ability to add to an existing map. Initializing and removing elements can
>> be handled by existing functionality.
>>
>> Best,
>> Mike
>>
>>


Re: [DISCUSS] Stellar MAP data type features

2019-04-12 Thread Michael Miklavcic
Following up on this thread, I've created a few tickets around this
functionality. I think this can be tackled in 3 parts:

   1. Add a MAP_PUT function -
   https://issues.apache.org/jira/browse/METRON-2071
   2. Add syntactic sugar for map get/put to Stellar and Stellar shell -
   https://issues.apache.org/jira/browse/METRON-2072
   3. Create a use case that enables an in-memory enrichment loading via
   serialized maps from the flatfile summarizer -
   https://issues.apache.org/jira/browse/METRON-2073

I already started on #1, and aside from some documentation updates and a
few more unit/integration tests, it's basically done.

#3 might already handle caching and the ability to update enrichments with
OBJECT_GET, but I haven't looked that deeply into it yet. There will also
probably be some use for a MAP_MERGE, though as I'm thinking on the fly
about this, we may be able to massage the REDUCE function to this purpose -
https://github.com/apache/metron/tree/master/metron-stellar/stellar-common#reduce
.
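
As a rough sketch of the REDUCE idea, written in Python rather than Stellar:
if a put operation exists, a merge can be expressed as a fold over the maps,
with an inner fold over each map's entries. The helper names map_put and
merge_with_reduce, the (key, value, map) argument order, and the choice to
copy rather than mutate are assumptions for illustration; whether Stellar's
REDUCE can iterate over map entries this way is not something I have
verified.

```python
from functools import reduce


def map_put(key, value, m):
    """Return a copy of m with key set to value (copying is an assumption)."""
    out = dict(m or {})
    out[key] = value
    return out


def merge_with_reduce(maps):
    """Merge maps using only a fold and map_put; later maps win on collisions."""
    def fold_map(acc, m):
        # Inner fold: put every entry of m into the accumulator.
        return reduce(lambda a, kv: map_put(kv[0], kv[1], a), m.items(), acc)

    # Outer fold: start from an empty map and absorb each map in turn.
    return reduce(fold_map, maps, {})


first = {"a": 1}
second = {"a": 2, "b": 3}
merged = merge_with_reduce([first, second])
```

The nested fold works, but it is verbose, which is one argument for a
dedicated merge function.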

Best,
Mike


On Fri, Apr 5, 2019 at 2:04 PM Michael Miklavcic <
michael.miklav...@gmail.com> wrote:

> Hi all,
>
> We have a number of data structures and functions available via Stellar
> for manipulating those structures. For example, we provide the following
> functions for initializing and working with Bloom filters:
> [Stellar]>>> ?BLOOM_
> ?BLOOM_ADD  ?BLOOM_EXISTS  ?BLOOM_INIT  ?BLOOM_MERGE
>
> For Maps, we have some functions for dealing with existing maps, e.g.
> MAP_GET, MAP_EXISTS. We even have a MapReduce type of function, MAP. And
> from an initialization perspective, you can create a map in Stellar with
> what amounts to an empty JSON object, e.g. mymap:={}. You can also turn
> JSON into a proper map using TO_JSON_MAP.
>
> I don't, however, see a way to add to map object contents, e.g. PUT. So
> the first question I have is whether this functionality would (assuming I
> haven't missed something obvious that enables this in the first place) be
> valuable to anyone, or if it's even something we should enable to begin
> with. A vote against this option could be the potential for users
> unwittingly loading up a ton of data and encountering OOM errors. This
> would be bad for running topologies. The second question is whether this
> should be a language feature or a specialized set of functions, e.g.
> MAP_INIT, MAP_PUT, MAP_REMOVE
> vs
> mymap:={}
> mymap['mike']:='miklavcic'
> mymap:=FILTER(mymap , (x) -> x != 'mike')
>
> *Note*, from the above, the only thing we appear to be missing is the
> ability to add to an existing map. Initializing and removing elements can
> be handled by existing functionality.
>
> Best,
> Mike
>
>