What is the best-practice approach for sharing a collection across bolts, where many bolts use it and each performs a specific summarization or statistics calculation? The objective is to retrieve the collection only once, instead of retrieving it separately for each bolt.
Should I just emit the collection from the intermediary bolt, or is there a better way, such as an internal cache?

The overall topology, using fieldsGrouping, is:

---

1) KafkaSpout: receives the identifier (UUID) that drives the retrieval of a collection of retail transactions, for example List<Transaction>.
2) Bolt: retrieves and emits (collector.emit) the collection of transactions that will be subject to multiple calculations. (Is this correct, or could it cause a memory issue as the number of bolts grows?)
3) Around 6 other bolts should use that same collection of transactions to execute different types of summarization and statistics calculations and write the metrics to Cassandra.

---

Thanks,
IPVP
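
To make the "internal cache" option concrete, here is a minimal plain-Java sketch of what I have in mind. It is not Storm API: `TransactionCache`, its `loader` function, and `evict` are hypothetical names I made up for illustration, and the loader stands in for the real retrieval. It also assumes the bolts that share it run in the same worker JVM; bolts in other worker processes would each get their own copy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper (not part of Storm): caches one collection per identifier.
final class TransactionCache<T> {
    private final Map<String, List<T>> entries = new ConcurrentHashMap<>();
    private final Function<String, List<T>> loader;

    TransactionCache(Function<String, List<T>> loader) {
        this.loader = loader;
    }

    // ConcurrentHashMap.computeIfAbsent applies the loader at most once per id,
    // even when several threads ask for the same id concurrently.
    List<T> get(String id) {
        return entries.computeIfAbsent(id, loader);
    }

    // Has to be called once all calculations for the id are done;
    // otherwise the cache keeps growing.
    void evict(String id) {
        entries.remove(id);
    }
}

class TransactionCacheDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        TransactionCache<String> cache = new TransactionCache<>(id -> {
            loads.incrementAndGet(); // stands in for the real retrieval
            return Arrays.asList(id + "-tx1", id + "-tx2");
        });

        cache.get("uuid-1"); // first "bolt": triggers the retrieval
        cache.get("uuid-1"); // second "bolt": served from the cache
        System.out.println(loads.get()); // prints 1
        cache.evict("uuid-1");
    }
}
```

With this approach only the UUID would travel between bolts, at the cost of having to decide when it is safe to evict an entry.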