Hi Casey,

That makes complete sense. Quick question: if there is no enrichment or profiling, does the message still pass through the enrichment/profiling topic?
If yes, do you think it would be possible for messages that don't need enrichment or profiling to skip that topic and go directly to the next one? Again, this is to avoid the extra in/out in Kafka.

Thanks for the explanation,
Michel

2018-06-23 3:58 GMT+01:00 Casey Stella <ceste...@gmail.com>:

> Hey Michel,
>
> Those are good questions and there were some reasons surrounding that. In
> fact, historically, we had fewer topologies (e.g. indexing and enrichment
> were merged). Even earlier on, we had just one giant topology per parser
> that enriched and indexed. Long story short, we moved this way because we
> saw how people were using Metron and we gained more insight tuning Metron.
> That led us down this architectural path.
>
> Some of the reasons that we went this way:
>
> - Fewer large topologies were a nightmare to tune
> - Enrichment would have different memory requirements than, say,
>   parsers or indexing
> - You can adjust the Kafka topic params per topology to adjust the
>   number of partitions, etc.
> - Having the separate topologies gives a natural set of extension points
>   for customization and enhancement (e.g. you want a phase between parsing
>   and enrichment).
> - Decoupling the topologies lets us spin up and down parts of Metron
>   without affecting others (e.g. you don't have to take down enrichments
>   to add a parser, even for a moment)
> - The movement to Flux meant we were limited in how much we could adjust
>   the topology at runtime (e.g. colocating parsers and enrichment would
>   mean moving away from Flux essentially, as the topology changes its
>   structure)
>
> Best,
>
> Casey
>
>
> On Fri, Jun 22, 2018 at 5:25 PM Michel Sumbul <michelsum...@gmail.com>
> wrote:
>
> > Hi Everyone,
> >
> > I was asking myself what the architectural reason was for splitting the
> > ingestion in Metron into 4 different topologies that all read/write to
> > Kafka?
> >
> > For example, why have the parsing and enrichment topologies not been
> > merged? Would it not be possible to enrich the message directly when
> > you parse it?
> >
> > I'm asking because splitting into several topologies means that all of
> > the topologies read/write to Kafka, which produces a bigger load on the
> > Kafka cluster and therefore a need for far more infrastructure/servers.
> > The cost is especially significant when we speak about TBs of data
> > ingested every day.
> >
> > I'm sure there was a very good reason, I was just curious.
> >
> > Thanks,
> > Michel