Github user cestella commented on a diff in the pull request:
https://github.com/apache/metron/pull/1099#discussion_r202803869
--- Diff:
metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
---
@@ -182,40 +185,61 @@ public void prepare(Map stormConf, TopologyContext context, OutputCollector coll
super.prepare(stormConf, context, collector);
messageGetStrategy = MessageGetters.DEFAULT_BYTES_FROM_POSITION.get();
this.collector = collector;
- if(getSensorParserConfig() != null) {
- cache = CachingStellarProcessor.createCache(getSensorParserConfig().getCacheConfig());
- }
- initializeStellar();
- if(getSensorParserConfig() != null && filter == null) {
- getSensorParserConfig().getParserConfig().putIfAbsent("stellarContext", stellarContext);
- if (!StringUtils.isEmpty(getSensorParserConfig().getFilterClassName())) {
- filter = Filters.get(getSensorParserConfig().getFilterClassName()
- , getSensorParserConfig().getParserConfig()
- );
+
+ // Build the Stellar cache
+ Map<String, Object> cacheConfig = new HashMap<>();
+ for (Map.Entry<String, ParserComponents> entry: sensorToComponentMap.entrySet()) {
+ String sensor = entry.getKey();
+ SensorParserConfig config = getSensorParserConfig(sensor);
+
+ if (config != null) {
+ cacheConfig.putAll(config.getCacheConfig());
}
}
+ cache = CachingStellarProcessor.createCache(cacheConfig);
- parser.init();
+ // Need to prep all sensors
+ for (Map.Entry<String, ParserComponents> entry: sensorToComponentMap.entrySet()) {
+ String sensor = entry.getKey();
+ MessageParser<JSONObject> parser = entry.getValue().getMessageParser();
--- End diff --
So, the consequences of this decision are as follows:
* You share an expression cache (i.e. the statement -> abstract syntax tree
cache, as distinct from the expression -> evaluated return cache)
* You share a Stellar value cache (expression -> evaluated return)
* You share the state in the Context (e.g. HBase connections, ZooKeeper
connections).
On the whole, anything shared in the Context is already intended to be shared
across users and sensors, because Stellar is used in the enrichment
topology (where it's not sensor-by-sensor), so we should be ok there. The real
question is whether users would prefer to have one knob per topology for
Stellar cache sizing or one knob per sensor.
I'd say that I'm ok with how this PR is doing it, because it's easier to reason
about resources, IMO, from a per-topology perspective.
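
To make the "one knob per topology" point concrete, here is a minimal sketch of what the putAll loop in the diff implies when two sensors set the same cache key. It is not the Metron code: the class name, the merge helper, the sensor names, and the "stellar.cache.maxSize" key are assumptions used for illustration only.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the cache-building loop in the diff; not Metron code.
public class SharedCacheConfigSketch {

  // Merge every sensor's cache config into one map, as the putAll loop does.
  // When two sensors set the same key, the one iterated later overwrites the earlier.
  static Map<String, Object> merge(Map<String, Map<String, Object>> perSensor) {
    Map<String, Object> merged = new LinkedHashMap<>();
    for (Map.Entry<String, Map<String, Object>> entry : perSensor.entrySet()) {
      Map<String, Object> config = entry.getValue();
      if (config != null) { // mirrors the null check on the sensor config in the diff
        merged.putAll(config);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    // Sensor names and the key are illustrative assumptions.
    Map<String, Map<String, Object>> perSensor = new LinkedHashMap<>();
    perSensor.put("bro", Collections.<String, Object>singletonMap("stellar.cache.maxSize", 10000));
    perSensor.put("snort", Collections.<String, Object>singletonMap("stellar.cache.maxSize", 500));

    // One cache is built from the merged map, so only one size survives.
    System.out.println(merge(perSensor));
  }
}
```

The sketch uses a LinkedHashMap so the iteration order is predictable. If the real sensorToComponentMap does not guarantee an order, which sensor's value wins would not be deterministic, which is another argument for treating cache sizing as a per-topology setting.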
---